action #57281
closed
[sle][Migration][SLE15SP2] test fails in orphaned_packages_check - switch to tty failed
100%
Description
Observation
Can't switch to tty after migration.
openQA test in scenario sle-15-SP2-Installer-DVD-x86_64-online_sles15_pscc_basesys+srv_def_full_y@64bit fails in
orphaned_packages_check
Test suite description
Reproducible
Fails since (at least) Build 6.2
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
Updated by hjluo about 5 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 20
The switch to tty failed.
Updated by hjluo about 5 years ago
This module only calls select_root_console, but the tty was not shown at that point. The desktop was not ready either; it may have crashed, which we need to investigate.
Updated by hjluo about 5 years ago
This case has now been moved to s390x and passed.
https://openqa.suse.de/tests/3473156
Updated by hjluo about 5 years ago
- % Done changed from 20 to 30
This case is currently blocked by bug bsc#1155180. Once that bug is fixed, we will recheck the orphaned_packages_check module.
Updated by hjluo about 5 years ago
- % Done changed from 30 to 40
We hit the same kind of issue in patch_sle. We now need to check whether the desktop crashed and find a way to fix it.
https://openqa.suse.de/tests/3561880
https://openqa.suse.de/tests/3561879
Updated by hjluo about 5 years ago
snapper_rollback
failed https://openqa.suse.de/tests/3554809
fixed https://openqa.suse.de/tests/3570051
Updated by hjluo about 5 years ago
Updated by okurz almost 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles15_media_basesys-srv-lgm-pcm_def_full
https://openqa.suse.de/tests/3598328
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by hjluo almost 5 years ago
- % Done changed from 40 to 50
This is actually an issue in switch_to_desktop, and it passed with PR#8880.
Verification run: https://openqa.suse.de/tests/3635304
Updated by hjluo almost 5 years ago
- % Done changed from 50 to 70
Another failure occurred in build 93.1 and passed with the fix in PR#8881.
https://openqa.suse.de/tests/3627740 => https://openqa.suse.de/tests/3635337
Updated by okurz almost 5 years ago
I checked the latest referenced failure https://openqa.suse.de/tests/3627740#step/patch_sle/105 and what I see there is that the check after the switch to tty6 times out after 60s. Did you check whether the X server itself might be running on tty6? 60s should be more than enough to switch to the display, but for checking you can also change the timeout or scale it with TIMEOUT_SCALE.
Updated by hjluo almost 5 years ago
For https://openqa.suse.de/tests/3627740#step/patch_sle/105, we can't verify it now because the 93.1 ISO file was deleted. I'll close that PR and use TIMEOUT_SCALE for this kind of issue instead.
Updated by hjluo almost 5 years ago
- Status changed from In Progress to Resolved
- % Done changed from 70 to 100
Resolving this ticket, as we'll use TIMEOUT_SCALE for this kind of error.
Updated by okurz almost 5 years ago
Please do not use "TIMEOUT_SCALE" for production tests, only for debugging or crosschecking or in case of really slow machines which we do not use in production.
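To make the debugging suggestion concrete, here is a minimal sketch of what scaling a timeout amounts to, assuming the scale setting acts as a plain multiplier on the timeout a test passes. It is written in Python for illustration only (the tests themselves are Perl), and `scaled_timeout` is a hypothetical helper, not an os-autoinst function.

```python
def scaled_timeout(base_seconds: float, timeout_scale: float = 1.0) -> float:
    """Return the effective timeout after applying a scale factor.

    Assumption: the scale is a simple multiplier; a factor of 1 leaves
    the timeout unchanged, which is what production runs should use.
    """
    if timeout_scale <= 0:
        raise ValueError("timeout_scale must be positive")
    return base_seconds * timeout_scale


# The 60s check from the failing step, scaled by a hypothetical factor of 3
# for a debugging run on a slow machine.
effective = scaled_timeout(60, 3)
print(f"effective timeout: {effective}s")
```

A larger factor only gives a slow system more time; if the hotkey was lost or the desktop froze, the check still fails, just later.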
Updated by leli almost 5 years ago
- Status changed from Resolved to In Progress
- % Done changed from 100 to 0
Reopening, since the issue is not resolved yet.
Updated by coolgw almost 5 years ago
If you check the log of the successful verification run, you will find that the console switch completed within one second. This means the OSD environment recovered by itself on the rerun; the pass is not related to the
PR (https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8881), which enlarges the timeout.
I guess two situations can trigger this issue:
1) Something went wrong within Linux (a crash? an X Window freeze?)
2) The Ctrl+Fx key sent by os-autoinst was lost
My current proposals for this issue are:
1) Submit a PR to collect more logs (enable more debug messages and upload more logs)
2) Try sending the Ctrl+Fx key more times to see whether the situation improves
Updated by hjluo almost 5 years ago
- % Done changed from 0 to 10
We discussed this issue and agreed that it is sporadic. On the migration side, we can check whether the switch to tty6 succeeded;
if not, we send the key again and error out after 3 tries.
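The retry idea can be sketched as follows. This is a Python illustration under stated assumptions, not the Perl test code: `switch_tty_with_retry`, `FlakyConsole`, and the two callbacks are hypothetical names standing in for "send the hotkey" and "check that the tty is selected".

```python
from typing import Callable


def switch_tty_with_retry(send_hotkey: Callable[[], None],
                          tty_selected: Callable[[], bool],
                          max_tries: int = 3) -> int:
    """Send the hotkey until the tty check succeeds.

    Returns the number of tries used, or raises after max_tries failures.
    """
    for attempt in range(1, max_tries + 1):
        send_hotkey()
        if tty_selected():
            return attempt
    raise RuntimeError(f"could not switch tty after {max_tries} tries")


class FlakyConsole:
    """Simulated console that loses the first `lost` hotkeys (hypothetical)."""

    def __init__(self, lost: int) -> None:
        self.lost = lost
        self.sent = 0

    def send_hotkey(self) -> None:
        self.sent += 1

    def tty_selected(self) -> bool:
        # The switch only "arrives" once more keys were sent than were lost.
        return self.sent > self.lost


console = FlakyConsole(lost=1)
tries = switch_tty_with_retry(console.send_hotkey, console.tty_selected)
print(f"switched after {tries} tries")
```

Returning the attempt count makes it easy to log how often a resend was needed, which would also help tell a lost key apart from a frozen desktop.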
Updated by hjluo almost 5 years ago
The call path looks like this:
activate_console ->my @tags = ("tty$nr-selected", "text-logged-in-$user");
[2019-12-23T08:31:09.800 CET] [debug] MMM -> patch_sle:wait_boot
[2019-12-23T08:31:09.800 CET] [debug] MMM -> opensusebasetest:wait_boot
[2019-12-23T08:31:23.730 CET] [debug] MMM ->opensusebasetest:wait_boot_past_bootloader
[2019-12-23T08:32:16.643 CET] [debug] MMM -> into the desktop
[2019-12-23T09:50:09.566 CET] [debug] MMMM ->activate_console
[2019-12-23T09:50:09.566 CET] [debug] activate_console, console: root-console, type: console
[2019-12-23T09:50:09.566 CET] [debug] MMM ->NNNNN call self->hyperv_console_switch(root-console, 6)
[2019-12-23T09:50:09.566 CET] [debug] MMM ->hyperv_console_switch
[2019-12-23T09:50:09.566 CET] [debug] /var/lib/openqa/share/tests/sle/tests/update/patch_sle.pm:75 called migration::setup_sle
[2019-12-23T09:50:09.566 CET] [debug] <<< testapi::wait_still_screen(similarity_level=47, stilltime=5, timeout=30)
[2019-12-23T09:50:15.097 CET] [debug] >>> testapi::wait_still_screen: detected same image for 5 seconds, last detected similarity is 50.
3353530196853
[2019-12-23T09:50:15.098 CET] [debug] /var/lib/openqa/share/tests/sle/tests/update/patch_sle.pm:75 called migration::setup_sle
[2019-12-23T09:50:15.098 CET] [debug] <<< testapi::check_screen(mustmatch=[
'tty6-selected',
'text-logged-in-root'
], timeout=60)
[2019-12-23T09:50:15.293 CET] [debug] >>> testapi::_handle_found_needle: found text-login-20180416, similarity 1.00 @ 715/34
[2019-12-23T09:50:15.293 CET] [debug] MMM -> VVVV switch tty_6 successed!
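As a rough cross-check of how fast the switch was in this passing run, the timestamps of the check_screen call and the needle match in the log above can be subtracted. The snippet below is a small Python sketch; the regular expression and the `log_time` helper are assumptions made for this log format only.

```python
import re
from datetime import datetime

# Matches the leading "[2019-12-23T09:50:15.098 CET]" stamp of a log line.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}) CET\]")


def log_time(line: str) -> datetime:
    """Extract the timestamp at the start of a log line."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp found in: {line!r}")
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")


# Two lines copied from the log above.
check_started = "[2019-12-23T09:50:15.098 CET] [debug] <<< testapi::check_screen(mustmatch=["
needle_found = ("[2019-12-23T09:50:15.293 CET] [debug] >>> testapi::_handle_found_needle: "
                "found text-login-20180416, similarity 1.00 @ 715/34")

elapsed = (log_time(needle_found) - log_time(check_started)).total_seconds()
print(f"needle matched after {elapsed:.3f}s")
```

In this run the match arrives well under a second after the check starts, far below the 60s timeout.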
Updated by hjluo almost 5 years ago
Hi Oliver,
In build 108.1 we didn't hit this issue. For further investigation, we'd like to add some debug output to os-autoinst's
query_isotovideo to see what happens when we can't switch tty. Do you have any ideas on how to fix this ticket?
Thanks!
Updated by hjluo almost 5 years ago
Another try on the 44 box: http://10.161.8.44/tests/1031
Updated by okurz almost 5 years ago
- Related to action #48110: [functional][u][sporadic] test failed in different modules that switch from textmode terminal to graphical terminal - unable to login into the gnome session again but we should not even need to login when selecting the correct tty added
Updated by okurz almost 5 years ago
- Related to action #41237: [functional][u][ipmi] test fails in first_boot after system shows text tty login prompt but fails to connect to machine over SSH -> need better post_fail_hook or retry, compare to s390x approach added
Updated by okurz almost 5 years ago
- Related to action #58505: [functional][y][timboxed: 12h] Make console switching working for hyper-v backend in the installer added
Updated by okurz almost 5 years ago
- Related to action #55115: [qe-core][functional] test fails in sssd - Test fails switching to serial terminal added
Updated by okurz almost 5 years ago
- Related to action #53720: [SLE][Migration][backlog] test fails in patch_sle - switch to console failed added
Updated by okurz almost 5 years ago
- Related to action #34471: [qe-core][functional][opensuse][medium] too early matching in too generic needle text-login-20160812 added
Updated by okurz almost 5 years ago
- https://github.com/os-autoinst/os-autoinst-distri-opensuse/commit/d13647566e5b095b9dc72cb5cc1b0056afdeaaa1#diff-a068d8ac3af290672e4e5a612f1be4e5 overrides the base class post_fail_hook, hence there is no longer a check for system responsiveness as ensured by lib/opensusebasetest. I suggest calling
$self->SUPER::post_fail_hook
in the post_fail_hook to ensure these checks are done as well.
- The test module "orphaned_packages_check" is not a good candidate to check for a properly logged in console. I think a better idea would be to call migration/sle12_online_migration/post_migration before console/orphaned_packages_check
- There should be no need to change os-autoinst, as basically the only thing os-autoinst does is send the hotkey. The check for the right screen after switching is done within https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/susedistribution.pm#L788 so you can simply change the test code there to handle the failed detection
- Please also see all the tickets I linked to the current one
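The first point, chaining to the base class hook rather than replacing it, can be illustrated with a short Python analogue of `$self->SUPER::post_fail_hook`. The class names and log messages are hypothetical stand-ins, not the actual test library.

```python
class BaseTest:
    """Stand-in for a base test class whose hook checks system responsiveness."""

    def __init__(self) -> None:
        self.hook_log: list[str] = []

    def post_fail_hook(self) -> None:
        self.hook_log.append("base: check system responsiveness")


class MigrationTest(BaseTest):
    """Subclass that adds its own steps without dropping the base checks."""

    def post_fail_hook(self) -> None:
        # Run the inherited checks first; omitting this call silently
        # disables them, which is the problem described above.
        super().post_fail_hook()
        self.hook_log.append("migration: collect extra logs")


test = MigrationTest()
test.post_fail_hook()
print(test.hook_log)
```

Calling the parent first keeps the responsiveness check in place even if the subclass's extra log collection fails later.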
Updated by tinawang123 almost 5 years ago
Updated by hjluo almost 5 years ago
Hi Oliver,
Thanks for the suggested fix; we'll try it and see how it works.
Huajian.Luo
Updated by hjluo almost 5 years ago
With PR https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9270, which moves orphaned_packages_check to the regression tests, we no longer need to load it during the online migration test.
So we'd like to close this ticket for now; further tty switch issues can be tracked in the following ticket: https://progress.opensuse.org/issues/48110
Updated by hjluo almost 5 years ago
- Status changed from In Progress to Resolved
- % Done changed from 40 to 100
Closing this ticket; we will reopen it if the issue is still reproducible in the regression tests.
Updated by okurz almost 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles15sp1_media_lp-we-basesys-srv-desk-dev-contm-lgm-py2-tsm-wsm_all_full
https://openqa.suse.de/tests/3853733
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz almost 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles15sp1_media_lp-we-basesys-srv-desk-dev-contm-lgm-py2-tsm-wsm_all_full
https://openqa.suse.de/tests/3900697
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed