action #104220

closed

openQA-in-openQA tests failing in boot step

Added by tinita almost 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2021-12-21
Due date:
% Done:

0%

Estimated time:

Description

Observation

Tests have been failing since Monday evening:
https://openqa.opensuse.org/group_overview/24?limit_builds=20

It just hangs in the boot step.

From autoinst-log.txt, heavily abridged because each of these messages occurs many times:

[2021-12-21T12:28:46.108910+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[2021-12-21T12:29:03.580402+01:00] [debug] no match: 360.9s, best candidate: gnome-desktop-20210420 (0.73)
[2021-12-21T12:29:04.302498+01:00] [debug] no change: 359.9s
[2021-12-21T12:32:48.935201+01:00] [debug] led state 0 1 1 -261
[...]
[2021-12-21T12:36:40.549675+01:00] [debug] post_fail_hook failed: command 'cd /root/openQA' timed out at openqa/lib/openQAcoretest.pm line 16.
[...]
[2021-12-21T12:36:40.775669+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"46782","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":775059,"seconds":1640086067}}
[2021-12-21T12:36:40.775813+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_CONNECTED","timestamp":{"microseconds":775678,"seconds":1640086067}}
[2021-12-21T12:36:40.775955+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_INITIALIZED","timestamp":{"microseconds":780239,"seconds":1640086067}}
[2021-12-21T12:36:40.776087+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":109228,"seconds":1640086126}}
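
The EVENT lines above carry a JSON payload with its own timestamp (seconds since the epoch), which is not necessarily the time at which the line was written to the log. A minimal sketch for pulling the VNC events out of autoinst-log.txt follows. It assumes the line layout shown in the excerpt; the function name parse_vnc_events is made up for illustration and is not part of os-autoinst.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Matches the "[debug] EVENT {...}" lines as they appear in the excerpt above.
EVENT_RE = re.compile(r"\[debug\] EVENT (\{.*\})\s*$")


def parse_vnc_events(lines):
    """Yield (event name, client port, UTC time) for each EVENT log line.

    Hypothetical helper: it relies only on the JSON shape visible in this ticket.
    """
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        payload = json.loads(match.group(1))
        stamp = payload["timestamp"]
        # Build the time from whole seconds plus microseconds to avoid float rounding.
        when = datetime.fromtimestamp(stamp["seconds"], tz=timezone.utc)
        when += timedelta(microseconds=stamp["microseconds"])
        yield payload["event"], payload["data"]["client"]["service"], when
```

Passing an open file handle for autoinst-log.txt to parse_vnc_events and printing the tuples makes it easier to see when each VNC connection was opened and dropped.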

Acceptance criteria

  • AC1: install/boot in the openQA-in-openQA tests runs successfully

Suggestions

  • Investigate why we didn't find out about this earlier
  • Determine the failure(s)
Actions #1

Updated by tinita almost 3 years ago

  • Subject changed from openQA-inopenQA tests failing to openQA-inopenQA tests failing in boot step
Actions #2

Updated by livdywan almost 3 years ago

  • Description updated (diff)
Actions #3

Updated by tinita almost 3 years ago

Investigate why we didn't find out about this earlier

I didn't check the email folder again yesterday after this started to happen (around 15:39 UTC).

Everyone subscribed to https://build.opensuse.org/project/show/devel:openQA should have gotten an email about this.

Cris, now that you ask, I don't see you in the recipient list.

Actions #4

Updated by livdywan almost 3 years ago

  • Description updated (diff)
Actions #5

Updated by tinita almost 3 years ago

I added "Add devel:openQA on OBS to your watchlist" to the wiki. But not sure if that is what is needed.
Be sure to also check "New comment for project created" and "New comment for package created" where you are maintainer.

Actions #6

Updated by livdywan almost 3 years ago

tinita wrote:

I added "Add devel:openQA on OBS to your watchlist" to the wiki. But not sure if that is what is needed.
Be sure to also check "New comment for project created" and "New comment for package created" where you are maintainer.

Apparently that was it

Actions #7

Updated by livdywan almost 3 years ago

  • Description updated (diff)
  • Status changed from New to Workable
  • Assignee set to livdywan

tinita wrote:

Tests have been failing since Monday evening:
https://openqa.opensuse.org/group_overview/24?limit_builds=20

It just hangs in the boot step.

https://openqa.opensuse.org/tests/2099602#step/boot/7

I guess I'll take a look, since I'm approaching alert fatigue, and either come up with a fix or switch off the tests.

Actions #8

Updated by livdywan almost 3 years ago

  • There are two relatively recent changes to test_running in os-autoinst-distri-openQA, which seem unrelated to me.
  • There's nothing obvious in the settings besides differences between workers.
  • In the last good job the log shows [2021-12-20T15:24:48.960239+01:00] [debug] >>> testapi::_handle_found_needle: found openqa-boot-menu-Tumbleweed-20190329, similarity 1.00 @ 65/11. In the failing job that seems to have made way for this:

    [2021-12-23T15:44:43.212072+01:00] [debug] no change: 188.0s
    [2021-12-23T15:44:44.093770+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
    [...]
    [2021-12-23T15:47:53.487010+01:00] [debug] >>> testapi::_check_backend_response: match=openqa-desktop timed out after 500 (assert_screen)
    [2021-12-23T15:47:53.573650+01:00] [info] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-desktop' matched

Notice the incredible number of "stalled" and "no change" messages.

Still not sure what changed here.
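
To compare a good job with a failing one, the number of these messages can be counted per log. This is a rough sketch, assuming the message wording seen in the excerpts in this ticket; count_messages and count_in_file are hypothetical helper names, not part of os-autoinst.

```python
import re
from collections import Counter

# Patterns based on the log excerpts quoted in this ticket.
PATTERNS = {
    "stalled": re.compile(r"considering VNC stalled"),
    "no change": re.compile(r"\bno change: [\d.]+s"),
    "no match": re.compile(r"\bno match: [\d.]+s"),
}


def count_messages(lines):
    """Count how many lines match each of the patterns above."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def count_in_file(path):
    """Count the messages in a log file such as autoinst-log.txt."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_messages(handle)
```

Running count_in_file on the autoinst-log.txt of the last good job and of a failing job gives two sets of counts that can be compared directly.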

Actions #9

Updated by livdywan almost 3 years ago

  • Status changed from Workable to Feedback

Tests started passing again.

cdywan wrote:

[2021-12-23T15:44:43.212072+01:00] [debug] no change: 188.0s
[2021-12-23T15:44:44.093770+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[...]
[2021-12-23T15:47:53.487010+01:00] [debug] >>> testapi::_check_backend_response: match=openqa-desktop timed out after 500 (assert_screen)
[2021-12-23T15:47:53.573650+01:00] [info] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-desktop' matched

Notice the incredible number of "stalled" and "no change" messages.

These are indeed gone. I don't fully grok what's (not) causing them and existing tickets don't obviously relate.

Actions #10

Updated by livdywan almost 3 years ago

Maybe related to #104517 in a way I don't understand?

Actions #11

Updated by livdywan almost 3 years ago

  • Status changed from Feedback to Resolved

I brought this up briefly in the Weekly, since we never found out the cause, but we still don't know what it was.
