action #104220 (closed)
openQA-in-openQA tests failing in boot step
Description
Observation
Tests have been failing since Monday evening:
https://openqa.opensuse.org/group_overview/24?limit_builds=20
It just hangs in the boot step.
From autoinst-log.txt, heavily redacted because all of these messages occur several times:
[2021-12-21T12:28:46.108910+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[2021-12-21T12:29:03.580402+01:00] [debug] no match: 360.9s, best candidate: gnome-desktop-20210420 (0.73)
[2021-12-21T12:29:04.302498+01:00] [debug] no change: 359.9s
[2021-12-21T12:32:48.935201+01:00] [debug] led state 0 1 1 -261
[...]
[2021-12-21T12:36:40.549675+01:00] [debug] post_fail_hook failed: command 'cd /root/openQA' timed out at openqa/lib/openQAcoretest.pm line 16.
[...]
[2021-12-21T12:36:40.775669+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"46782","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":775059,"seconds":1640086067}}
[2021-12-21T12:36:40.775813+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_CONNECTED","timestamp":{"microseconds":775678,"seconds":1640086067}}
[2021-12-21T12:36:40.775955+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_INITIALIZED","timestamp":{"microseconds":780239,"seconds":1640086067}}
[2021-12-21T12:36:40.776087+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":109228,"seconds":1640086126}}
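The EVENT lines above are hard to read by eye. A minimal sketch for pulling out just the VNC events follows, assuming the log keeps the `[timestamp] [debug] EVENT {json}` layout shown in the excerpt. The function name `vnc_events` and the returned tuple layout are made up for illustration and are not part of os-autoinst. The two sample lines are copied from the excerpt.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Matches lines such as: [<log timestamp>] [debug] EVENT {...json...}
EVENT_LINE = re.compile(
    r"^\[(?P<logged>[^\]]+)\] \[debug\] EVENT (?P<payload>\{.*\})\s*$"
)


def vnc_events(lines):
    """Yield (logged_at, event, client_port, event_time_utc) per VNC event.

    logged_at is the timestamp written by the logger; event_time_utc is
    rebuilt from the "timestamp" object inside the JSON payload.
    """
    for line in lines:
        match = EVENT_LINE.match(line)
        if not match:
            continue
        payload = json.loads(match.group("payload"))
        event = payload.get("event", "")
        if not event.startswith("VNC_"):
            continue
        stamp = payload.get("timestamp", {})
        event_time = datetime.fromtimestamp(
            stamp.get("seconds", 0), tz=timezone.utc
        ) + timedelta(microseconds=stamp.get("microseconds", 0))
        port = payload.get("data", {}).get("client", {}).get("service")
        yield match.group("logged"), event, port, event_time


# Two lines copied from the excerpt above.
SAMPLE = [
    '[2021-12-21T12:36:40.775669+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"46782","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":775059,"seconds":1640086067}}',
    '[2021-12-21T12:36:40.776087+01:00] [debug] EVENT {"data":{"client":{"family":"ipv4","host":"127.0.0.1","service":"48106","websocket":false},"server":{"auth":"none","family":"ipv4","host":"0.0.0.0","service":"6001","websocket":false}},"event":"VNC_DISCONNECTED","timestamp":{"microseconds":109228,"seconds":1640086126}}',
]

for logged_at, event, port, event_time in vnc_events(SAMPLE):
    print(logged_at, event, port, event_time.isoformat())
```

Printing both timestamps side by side makes it easier to check whether an event was logged at the time it happened or only flushed later.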
Acceptance criteria
- AC1: install/boot in openQA-in-openQA tests runs successfully
Suggestions
- Investigate why we didn't find out about this earlier
- Determine the failure(s)
Updated by tinita almost 3 years ago
- Subject changed from openQA-inopenQA tests failing to openQA-inopenQA tests failing in boot step
Updated by tinita almost 3 years ago
Investigate why we didn't find out about this earlier
I didn't look into the email folder anymore yesterday when this started to happen (around 15:39 UTC).
Everyone subscribed to https://build.opensuse.org/project/show/devel:openQA should have gotten an email about this.
Cris, now that you ask, I don't see you in the recipient list.
Updated by tinita almost 3 years ago
I added "Add devel:openQA on OBS to your watchlist" to the wiki. But not sure if that is what is needed.
Be sure to also check "New comment for project created" and "New comment for package created" where you are maintainer.
Updated by livdywan almost 3 years ago
tinita wrote:
I added "Add devel:openQA on OBS to your watchlist" to the wiki. But not sure if that is what is needed.
Be sure to also check "New comment for project created" and "New comment for package created" where you are maintainer.
Apparently that was it
Updated by livdywan almost 3 years ago
- Description updated (diff)
- Status changed from New to Workable
- Assignee set to livdywan
tinita wrote:
Tests have been failing since Monday evening:
https://openqa.opensuse.org/group_overview/24?limit_builds=20
It just hangs in the boot step.
https://openqa.opensuse.org/tests/2099602#step/boot/7
I guess I'll take a look, since I'm approaching alert fatigue, and either come up with a fix or switch off the tests.
Updated by livdywan almost 3 years ago
- There are two relatively recent changes to test_running in os-autoinst-distri-openQA, which seem unrelated to me.
- There's nothing obvious in settings besides differences between workers.
[2021-12-20T15:24:48.960239+01:00] [debug] >>> testapi::_handle_found_needle: found openqa-boot-menu-Tumbleweed-20190329, similarity 1.00 @ 65/11
in the last good job seems to have made way for this:
[2021-12-23T15:44:43.212072+01:00] [debug] no change: 188.0s
[2021-12-23T15:44:44.093770+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[...]
[2021-12-23T15:47:53.487010+01:00] [debug] >>> testapi::_check_backend_response: match=openqa-desktop timed out after 500 (assert_screen)
[2021-12-23T15:47:53.573650+01:00] [info] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-desktop' matched
Notice the incredible number of "stalled" and "no change" messages.
Still not sure what changed here.
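To put a number on "stalled" and "no change" when comparing a good and a bad job, a rough tally like the following could help. This is only a sketch: the message patterns are taken from the log lines quoted in this ticket, and the names `PATTERNS` and `tally` are invented for illustration.

```python
import re
from collections import Counter

# Message patterns taken from the autoinst-log.txt excerpts in this ticket.
PATTERNS = {
    "vnc_stalled": re.compile(r"considering VNC stalled"),
    "no_change": re.compile(r"\] no change: [\d.]+s"),
    "no_match": re.compile(r"\] no match: [\d.]+s"),
}


def tally(lines):
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Lines copied from the failing job's log quoted above.
SAMPLE = [
    "[2021-12-23T15:44:43.212072+01:00] [debug] no change: 188.0s",
    "[2021-12-23T15:44:44.093770+01:00] [debug] considering VNC stalled, no update for 4.00 seconds",
    "[2021-12-23T15:47:53.487010+01:00] [debug] >>> testapi::_check_backend_response: match=openqa-desktop timed out after 500 (assert_screen)",
]

print(dict(tally(SAMPLE)))
```

Running `tally` over the full autoinst-log.txt of the last good job and of a failing job (opened with `open(path)`) would show whether the difference is really in these messages or elsewhere.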
Updated by livdywan almost 3 years ago
- Status changed from Workable to Feedback
Tests started passing again.
- The only change I'm aware of is "Remove log messages because missing details-*-.json files are expected", which I couldn't confirm affects this.
- NEEDLES_GIT_HASH is 8de82ea09f4b696fedf3ecbc528e92841689f9a7 in all cases, so a needle change won't explain it either.
- No recent change in os-autoinst-distri-opensuse seems related.
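The settings comparison between a passing and a failing job can be done mechanically rather than by eye. Below is a minimal sketch, assuming the settings of both jobs are already available as plain dictionaries, for example copied from each job's settings page. The helper `diff_settings` and the `WORKER_HOSTNAME` values are hypothetical and only there for illustration. The NEEDLES_GIT_HASH value is the one quoted above.

```python
def diff_settings(good, bad, ignore=()):
    """Return {key: (good_value, bad_value)} for every setting that differs.

    Keys missing on one side show up with None for that side.
    """
    keys = (set(good) | set(bad)) - set(ignore)
    return {
        key: (good.get(key), bad.get(key))
        for key in sorted(keys)
        if good.get(key) != bad.get(key)
    }


# Hypothetical example data; only the NEEDLES_GIT_HASH value is from this ticket.
good_job = {
    "NEEDLES_GIT_HASH": "8de82ea09f4b696fedf3ecbc528e92841689f9a7",
    "WORKER_HOSTNAME": "worker-a.example",  # made-up value
}
bad_job = {
    "NEEDLES_GIT_HASH": "8de82ea09f4b696fedf3ecbc528e92841689f9a7",
    "WORKER_HOSTNAME": "worker-b.example",  # made-up value
}

print(diff_settings(good_job, bad_job))
print(diff_settings(good_job, bad_job, ignore=("WORKER_HOSTNAME",)))
```

Passing worker-specific keys via `ignore` filters out the expected differences between workers, so anything left over is worth a closer look.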
cdywan wrote:
[2021-12-23T15:44:43.212072+01:00] [debug] no change: 188.0s
[2021-12-23T15:44:44.093770+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[...]
[2021-12-23T15:47:53.487010+01:00] [debug] >>> testapi::_check_backend_response: match=openqa-desktop timed out after 500 (assert_screen)
[2021-12-23T15:47:53.573650+01:00] [info] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-desktop' matched
Notice the incredible number of "stalled" and "no change" messages.
These are indeed gone. I don't fully grok what's (not) causing them and existing tickets don't obviously relate.
Updated by livdywan almost 3 years ago
Maybe related to #104517 in a way I don't understand?
Updated by livdywan almost 3 years ago
- Status changed from Feedback to Resolved
Brought it up briefly in the weekly, since we never found out the cause of this, but nobody knows either way.