action #165923

open

[qa-tools][vmware][spikesolution][timeboxed:20h] VNC reconnect after reboot size:S

Added by mloviska 4 months ago. Updated about 2 months ago.

Status: Workable
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date: 2024-08-28
Due date:
% Done: 0%
Estimated time:

Description

Observation

openQA test in scenario sle-micro-6.1-Base-VMware-x86_64-cloud-init@svirt-vmware70 fails in suseconnect_scc

VNC has been shown to be unstable on VMware, especially around reboots: the SUT usually waits at the grub2 screen, and the test has to confirm the default option there.
Most of the failures currently occurring on VMware are caused by a missed 'ret' key in grub2 or by a stalled VNC connection.

This was not really a problem with SLES, because reboots occur less often there than in SLEM tests.

[2024-08-28T11:34:27.435259+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:118 called testapi::wait_screen_change
[2024-08-28T11:34:27.435393+02:00] [debug] [pid:40411] <<< testapi::wait_screen_change(timeout=10, similarity_level=50)
[2024-08-28T11:34:27.437833+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:118 called testapi::wait_screen_change -> products/sle-micro/../../lib/transactional.pm:118 called testapi::send_key
[2024-08-28T11:34:27.437971+02:00] [debug] [pid:40411] <<< testapi::send_key(key="ret", wait_screen_change=0)
[2024-08-28T11:34:57.154911+02:00] [debug] [pid:40412] considering VNC stalled, no update for 30.03 seconds
[2024-08-28T11:34:57.157505+02:00] [debug] [pid:40412] Establishing VNC connection over WebSockets via https://esxi7.qa.suse.cz
[2024-08-28 11:34:58.24094] [42759] [info] Establishing WebSocket connection to wss://esxi7.qa.suse.cz:443/ticket/6c8025b2e3c3a460
[2024-08-28 11:34:58.24231] [42759] [info] Client accepted
[2024-08-28 11:34:58.26160] [42759] [info] WebSocket connection established
[2024-08-28T11:35:01.079317+02:00] [debug] [pid:40411] >>> testapi::wait_screen_change: screen change seen after 33.3535449504852 seconds (similarity: 21.3448611590361)
[2024-08-28T11:35:01.079747+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:120 called testapi::assert_screen

So the problem is that the VNC connection gets lost, but our current stall detection is not good enough because it does not re-send any keys or clicks that may have been lost.
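
One possible workaround at the test level: in the log above, `send_key(key="ret")` is followed by no screen update until the connection is re-established, so the key could be re-sent while the screen stays unchanged. The following is a minimal Python sketch of that retry logic only, not os-autoinst code. The function name `send_key_until_screen_changes` and both callables are hypothetical stand-ins for whatever the test API provides.

```python
from typing import Callable


def send_key_until_screen_changes(
    send_key: Callable[[str], None],         # hypothetical: delivers one key press to the SUT
    screen_changed: Callable[[float], bool],  # hypothetical: True if the screen changed within the timeout
    key: str = "ret",
    attempts: int = 3,
    timeout: float = 10.0,
) -> int:
    """Send `key`, then re-send it for as long as the screen stays unchanged.

    Returns how many times the key was sent. Raises RuntimeError if the
    screen never changed within `attempts` tries.
    """
    for attempt in range(1, attempts + 1):
        send_key(key)
        # Only retry when nothing happened on screen; a visible change is
        # taken as evidence that the key press arrived.
        if screen_changed(timeout):
            return attempt
    raise RuntimeError(f"screen did not change after sending {key!r} {attempts} times")
```

Re-sending only while the screen is unchanged keeps the risk of a duplicate 'ret' low, but it does not remove it: if the first key press arrived and the screen update was merely delayed by the stall, the second one would land on whatever comes after grub2.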

Reproducible

Fails since (at least) Build 3.12

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Acceptance criteria

  • AC1: We know how the problematic situation can be fixed or worked around.

Suggestions

  • Maybe the situation can be detected
    • Re-establish the VNC connection from test code or from testapi (taking care dewebsockify is also restarted as needed)
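
A rough sketch of how detection, reconnect and replay of possibly lost input could fit together, written in Python for illustration only. Everything here is an assumption: the class `VncStallWatchdog`, its method names and the `reconnect` hook (which would also have to restart dewebsockify as needed) are hypothetical and do not exist in os-autoinst. The sketch also makes the simplifying assumption that any screen update after a key press means the key was delivered.

```python
import time
from typing import Callable, List


class VncStallWatchdog:
    """Hypothetical helper: reconnect on stall and replay unconfirmed keys."""

    def __init__(
        self,
        reconnect: Callable[[], None],    # hypothetical hook: re-establish VNC (and dewebsockify)
        send_key: Callable[[str], None],  # hypothetical hook: deliver one key press
        stall_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.reconnect = reconnect
        self.send_key = send_key
        self.stall_timeout = stall_timeout
        self.clock = clock
        self.last_update = clock()
        self.pending: List[str] = []  # keys sent since the last screen update

    def key_sent(self, key: str) -> None:
        """Send a key and remember it until a screen update is seen."""
        self.send_key(key)
        self.pending.append(key)

    def frame_received(self) -> None:
        """Record a screen update; assume earlier keys were delivered."""
        self.last_update = self.clock()
        self.pending.clear()

    def poll(self) -> bool:
        """Reconnect and replay pending keys if the connection looks stalled."""
        if self.clock() - self.last_update < self.stall_timeout:
            return False
        self.reconnect()
        self.last_update = self.clock()
        # Replay keys that were never followed by a screen update; they stay
        # pending so that a second stall would replay them again.
        for key in list(self.pending):
            self.send_key(key)
        return True
```

The main design question this leaves open is when a key counts as delivered. Clearing the pending list on any screen update is the simplest rule, but an unrelated update (for example a blinking cursor) would also clear it.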

Related issues 3 (1 open, 2 closed)

Related to Containers and images - action #164922: [vmware] add combustion & ignition support (Resolved, mloviska, 2024-08-05)
Related to Containers and images - action #162941: Add job group definitions for SLEM 6.0 to QAC-yaml (Resolved, mdati, 2024-06-27)
Related to openQA Tests (public) - action #165890: [qe-core][qe-virt]test fails in host_config - Cannot login the system with enter 'ret' key (New, 2024-08-28)

Actions #1

Updated by mloviska 4 months ago

  • Related to action #164922: [vmware] add combustion & ignition support added
Actions #2

Updated by mloviska 4 months ago

  • Project changed from openQA Tests (public) to openQA Project (public)
  • Category deleted (Bugs in existing tests)
Actions #3

Updated by tinita 4 months ago

  • Target version set to Ready
Actions #4

Updated by mdati 4 months ago

Probably a similar issue to, or the same issue as, the one dealt with in https://progress.opensuse.org/issues/165890

Actions #5

Updated by mdati 4 months ago

  • Related to action #162941: Add job group definitions for SLEM 6.0 to QAC-yaml added
Actions #6

Updated by mkittler 4 months ago

  • Related to action #165890: [qe-core][qe-virt]test fails in host_config - Cannot login the system with enter 'ret' key added
Actions #7

Updated by livdywan 4 months ago

  • Subject changed from [qa-tools][vmware] VNC reconnect after reboot to [qa-tools][vmware][spikesolution][timeboxed:20h] VNC reconnect after reboot size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #8

Updated by okurz 3 months ago

  • Category set to Feature requests
  • Target version changed from Ready to future

I am kindly asking anybody outside the tools team to look into this, especially if you have more experience with VMware-based openQA tests. We have really good code coverage of tests within os-autoinst nowadays, so it should be relatively easy to integrate code changes.

Actions #9

Updated by openqa_review about 2 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: cloud-init@svirt-vmware70
https://openqa.suse.de/tests/15846827#step/first_boot/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
