action #165923
Updated by livdywan 9 months ago
## Observation

openQA test in scenario sle-micro-6.1-Base-VMware-x86_64-cloud-init@svirt-vmware70 fails in [suseconnect_scc](https://openqa.suse.de/tests/15282809/modules/suseconnect_scc/steps/22)

VNC has been shown to be unstable on VMware, especially when a reboot occurs. After a reboot the SUT usually waits at the grub2 screen, and tests should confirm the default option in grub2. Most of the failures currently occurring on VMware are caused by a missed 'ret' key when confirming grub2 or by the VNC connection getting stalled. This was not really a problem with SLES because reboots do not occur as often there as they do in SLEM tests.

```
[2024-08-28T11:34:27.435259+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:118 called testapi::wait_screen_change
[2024-08-28T11:34:27.435393+02:00] [debug] [pid:40411] <<< testapi::wait_screen_change(timeout=10, similarity_level=50)
[2024-08-28T11:34:27.437833+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:118 called testapi::wait_screen_change -> products/sle-micro/../../lib/transactional.pm:118 called testapi::send_key
[2024-08-28T11:34:27.437971+02:00] [debug] [pid:40411] <<< testapi::send_key(key="ret", wait_screen_change=0)
[2024-08-28T11:34:57.154911+02:00] [debug] [pid:40412] considering VNC stalled, no update for 30.03 seconds
[2024-08-28T11:34:57.157505+02:00] [debug] [pid:40412] Establishing VNC connection over WebSockets via https://esxi7.qa.suse.cz
[2024-08-28 11:34:58.24094] [42759] [info] Establishing WebSocket connection to wss://esxi7.qa.suse.cz:443/ticket/6c8025b2e3c3a460
[2024-08-28 11:34:58.24231] [42759] [info] Client accepted
[2024-08-28 11:34:58.26160] [42759] [info] WebSocket connection established
[2024-08-28T11:35:01.079317+02:00] [debug] [pid:40411] >>> testapi::wait_screen_change: screen change seen after 33.3535449504852 seconds (similarity: 21.3448611590361)
[2024-08-28T11:35:01.079747+02:00] [debug] [pid:40411] tests/console/suseconnect_scc.pm:44 called transactional::process_reboot -> products/sle-micro/../../lib/transactional.pm:120 called testapi::assert_screen
```

So the problem is that the VNC connection gets lost, but our current stall detection is not good enough because it does not re-send any keys/clicks that may have been lost.

* https://openqa.suse.de/tests/15282809#step/suseconnect_scc/22
* https://openqa.suse.de/tests/15282550#step/suseconnect_scc/12
* https://openqa.suse.de/tests/15282568#step/host_config/11
* https://openqa.suse.de/tests/15099017#step/trup_smoke/41
* https://openqa.suse.de/tests/15282806#step/transactional_update/25

## Reproducible

Fails since (at least) Build [3.12](https://openqa.suse.de/tests/14827520)

## Expected result

Last good: (unknown) (or more recent)

## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle-micro&flavor=Base-VMware&machine=svirt-vmware70&test=cloud-init&version=6.1)

## Acceptance criteria

* **AC1**: We know how the problematic situation can be fixed or worked around.

## Suggestions

* Maybe the situation can be detected
* Re-establish the VNC connection from test code or from testapi (taking care that dewebsockify is also restarted as needed)
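
To make the "detect and re-send" idea more concrete, here is a minimal sketch in Python. It is for illustration only: os-autoinst is written in Perl and none of this is existing testapi code. The callables `send_key`, `expected_screen_seen` and `reconnect_count` are hypothetical stand-ins for "send a key", "wait for the expected screen with a timeout" and "how many times the VNC connection has been re-established so far". The assumption is that a key is only worth re-sending if a reconnect happened while we were waiting for the expected screen, since that is when the key may have been lost.

```python
from typing import Callable


def send_key_resilient(
    send_key: Callable[[str], None],
    expected_screen_seen: Callable[[], bool],
    reconnect_count: Callable[[], int],
    key: str,
    max_attempts: int = 3,
) -> int:
    """Send `key` and re-send it if the VNC connection was re-established
    before the expected screen showed up.

    All three callables are hypothetical placeholders, not testapi functions.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        # Remember how many reconnects happened before this attempt.
        reconnects_before = reconnect_count()
        send_key(key)

        # Assumed to block up to some timeout, like an assert/check would.
        if expected_screen_seen():
            return attempt

        if reconnect_count() == reconnects_before:
            # No stall/reconnect in the meantime: the key most likely arrived,
            # so re-sending would hide a different problem.
            raise RuntimeError(
                f"expected screen not seen after '{key}' and no VNC reconnect"
            )
        # A reconnect happened while waiting: the key may be lost, try again.

    raise RuntimeError(
        f"expected screen not seen after sending '{key}' {max_attempts} times"
    )
```

Note that in the log above `wait_screen_change` did report a change after the reconnect, so a plain "did the screen change" check is probably not a reliable signal by itself. That is why the sketch keys off the reconnect counter plus the expected screen. Whether re-sending is safe depends on the key: repeating 'ret' at the grub2 menu is likely harmless, other keys or clicks may not be.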