action #35589
coordination #35302 (closed): [qe-core][opensuse][functional][epic][sporadic] Various unstable tests on o3
[functional][u][opensuse][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed
Added by JERiveraMoya over 6 years ago. Updated almost 5 years ago.
Description
Observation
openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-kde-wayland@64bit_virtio fails in kontact.
akregator is opened, apparently after running akonadictl start.
Before the test finishes, akregator needs to be closed so that the needles checking that the application is closed and the desktop is visible can match.
Reproducible
Fails since (at least) Build 20180420
Expected result
Last good: 20180419 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by mloviska over 6 years ago
Similar issue
https://openqa.opensuse.org/tests/664692#step/kontact/21
Amarok was not closed by previous test.
Updated by okurz over 6 years ago
- Subject changed from [opensuse][functional] test fails in kontact - akregator not closed to [opensuse][u][functional] test fails in kontact - akregator not closed
- Due date set to 2018-06-05
- Target version set to Milestone 16
Updated by okurz over 6 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: kde-wayland@64bit_virtio-2G
https://openqa.opensuse.org/tests/676017
Updated by SLindoMansilla over 6 years ago
- Subject changed from [opensuse][u][functional] test fails in kontact - akregator not closed to [opensuse][u][functional][sporadic][medium] test fails in kontact - akregator not closed
We can see that the command is not typed entirely. Maybe the command runner is not ready when the test starts typing (missing keys?).
Updated by SLindoMansilla over 6 years ago
- Due date changed from 2018-06-05 to 2018-06-19
Not enough capacity during sprint 18
Updated by okurz over 6 years ago
- Due date set to 2018-08-14
- Target version changed from Milestone 16 to Milestone 18
Updated by mloviska over 6 years ago
From the video it seems like the desktop runner is ready to accept the whole input. As the script types the first characters "ak", the runner gives suggestions pointing to amarok. Somehow the first characters get deleted, therefore we see incomplete input.
Updated by okurz over 6 years ago
- Target version changed from Milestone 18 to Milestone 18
Updated by okurz over 6 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: kde-wayland@64bit_virtio-2G
https://openqa.opensuse.org/tests/677523
Updated by okurz over 6 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: kde-wayland@64bit_virtio
https://openqa.opensuse.org/tests/712629
Updated by zluo over 6 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
take over
Updated by zluo over 6 years ago
filed https://progress.opensuse.org/issues/38864 for compilation issue.
Updated by zluo over 6 years ago
I see a ticket blocker: https://progress.opensuse.org/issues/37501
Updated by zluo over 6 years ago
http://e13.suse.de/tests/7120#step/desktop_runner :(
desktop runner doesn't start up.
Updated by okurz over 6 years ago
Your customized test schedule cannot work this way because x11/kontact does not call select_console('x11'), so you are stuck on the terminal that consoletest_setup selected. Either schedule another module in between, e.g. consoletest_finish, don't schedule consoletest_setup, or put the call to select_console('x11') temporarily into the test module kontact.
Updated by zluo over 6 years ago
At the moment I cannot reproduce this issue with akregator. It seems not to be started at all in my test runs:
x11_start_program('akonadictl start', valid => 0);
# Workaround: sometimes the account assistant behind of mainwindow or tips window
# To disable it run at first time start
x11_start_program("echo \"[General]\" >> ~/.kde4/share/config/kmail2rc", valid => 0);
x11_start_program("echo \"first-start=false\" >> ~/.kde4/share/config/kmail2rc", valid => 0);
x11_start_program("echo \"[General]\" >> ~/.config/kmail2rc", valid => 0);
x11_start_program("echo \"first-start=false\" >> ~/.config/kmail2rc", valid => 0);
There are no checks for status, and the video doesn't show akregator at all.
Updated by okurz over 6 years ago
- Status changed from In Progress to Blocked
- Assignee changed from zluo to okurz
So it seems there is some confusion. "akregator" is the news reader that is visible in https://openqa.opensuse.org/tests/662767#step/kontact/2 . Where this comes from I do not really know. It might be related to akonadi starting, but akonadi != akregator; akonadi is a background server.
For now we are blocked by bsc#1102832 . Let's see how it looks afterwards.
Updated by okurz over 6 years ago
- Due date changed from 2018-08-14 to 2018-08-28
- Status changed from Blocked to Workable
- Assignee deleted (okurz)
back to original problem and workable: https://openqa.opensuse.org/tests/723149#step/kontact/2
Updated by jorauch over 6 years ago
- Assignee deleted (jorauch)
No real progress here, therefore unassigning.
Could not reproduce it and since it's a sporadic start of an unwanted program I think this is more of a product issue
Updated by oorlov over 6 years ago
- Assignee set to oorlov
As far as I can see, keys go missing when the x11_start_program function is called. Sometimes it writes 'nadictl start', sometimes 'ona', etc. All of them are parts of 'akonadictl start'. The question is why KDE opens 'Akregator' for all of those keywords.
- I'll try to reproduce the issue locally with MAKETESTSNAPSHOTS=1 parameter to be able to connect to the failed module quickly;
- The system might be unresponsive for some time due to high load caused by some process. So, I'll try to gather logs on what is running at that moment and how high CPU and memory usage are;
- It might be possible that the previous module (amarok.pm) affects this one, as only the first command after closing Amarok is affected. All further commands are written without missing keys. So I'll try to check different combinations.
Updated by oorlov over 6 years ago
I've tried to reproduce the issue locally, but the test failed in the 'start_wayland_plasma5' module.
The appropriate bsc ticket is marked as RESOLVED FIXED on 13.08.2018, but I've found out that the issue is still happening on o3 on 15.08.2018 and I've also reproduced it locally with the latest build.
So I've asked fvogt whether the fix is already applied in the latest build, but he hasn't replied yet.
Updated by mgriessmeier about 6 years ago
- Due date changed from 2018-08-28 to 2018-09-11
Updated by oorlov about 6 years ago
A fix for the 'start_wayland_plasma5' module was applied in https://bugzilla.opensuse.org/show_bug.cgi?id=1105798.
So, once the fix appears on o3, it may be possible to reproduce the issue with the 'kontact' module.
Updated by mgriessmeier about 6 years ago
- Due date changed from 2018-09-11 to 2018-09-25
Updated by oorlov about 6 years ago
The fix for 'start_wayland_plasma5' is on o3, so I'm now investigating the 'kontact' issue.
Updated by SLindoMansilla about 6 years ago
- Due date changed from 2018-09-25 to 2018-10-09
Moving to sprint 27. Not able to finish during 26.
Updated by okurz about 6 years ago
- Target version changed from Milestone 18 to Milestone 19
Updated by oorlov about 6 years ago
- Status changed from In Progress to Workable
- Assignee deleted (oorlov)
I was not able to reproduce the issue locally, as modules that run before 'kontact' failed in test runs on my machine, e.g. http://10.160.65.138/tests/421, http://10.160.65.138/tests/422, http://10.160.65.138/tests/423.
The issue is hard to catch, as a lot of modules have to pass before the 'kontact' module is executed. So, could you please consider this before re-estimating the ticket.
Updated by okurz about 6 years ago
- Due date changed from 2018-10-09 to 2018-10-23
- Target version changed from Milestone 19 to Milestone 20
I suggest to call just boot_to_desktop and then kontact to ensure that kontact itself is not the problem. And then schedule boot_to_desktop, amarok, kontact. It makes sense to work on this in the next sprint.
Updated by zluo about 6 years ago
- Status changed from Workable to In Progress
to check latest job for this scenario and clone the job locally at first.
Updated by zluo about 6 years ago
use assert_and_click, create a new needle for the fix:
Updated by zluo about 6 years ago
Updated by zluo about 6 years ago
- Status changed from In Progress to Blocked
kontact failed on o3 because of a fatal error:
https://bugzilla.opensuse.org/show_bug.cgi?id=1111606
So this issue blocks verification of test runs on o3.
Set it as blocked for now.
Updated by okurz about 6 years ago
- Due date deleted (2018-10-23)
- Target version changed from Milestone 20 to Milestone 21
Yes, good point. No need to invest time on something that is blocked by a bug. When we are done with all other tasks we can still revisit here ;) I closed your bug as a duplicate of the same report I created already earlier on openSUSE Krypton: https://bugzilla.suse.com/show_bug.cgi?id=1105207
Updated by okurz about 6 years ago
- Has duplicate action #42830: [functional][u] test fails in kontact - akregator was launched at the first place added
Updated by okurz about 6 years ago
- Subject changed from [opensuse][u][functional][sporadic][medium] test fails in kontact - akregator not closed to [opensuse][u][functional][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed
- Status changed from Blocked to Workable
- Assignee deleted (zluo)
The bug does not seem to move forward, so we should invest in a workaround: one can click the OK button of the error message with a soft fail and continue.
Updated by zluo about 6 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
take over and check this for a workaround.
Updated by zluo about 6 years ago
need a test run against latest build and create a needle for close dialog with fatal error:
Updated by zluo about 6 years ago
Updated by mgriessmeier about 6 years ago
- Status changed from In Progress to Feedback
Updated by zluo about 6 years ago
- Status changed from Feedback to In Progress
http://e13.suse.de/tests/10470 shows fatal error, so trying now to use workaround and hope to get verified test run.
Updated by zluo about 6 years ago
http://e13.suse.de/tests/10533#step/kontact/16 shows the example where x11_start_program('killall kontact', valid => 0); doesn't work.
Updated by zluo about 6 years ago
http://e13.suse.de/tests/10533#step/kontact/16 shows the example: after killall kontact, akregator is still there
Now trying with:
record_soft_failure('akregator cannot be closed, related to issue of bsc#1105207') && return if (check_screen 'akregator-not-closed');
Updated by zluo about 6 years ago
Updated by zluo about 6 years ago
http://e13.suse.de/tests/10638#step/kontact/11 shows a performance issue. This is really weird: while typing characters, it starts Amarok.
Updated by zluo about 6 years ago
QEMURAM 1536 is not so much to run the whole tests...
Updated by zluo almost 6 years ago
https://openqa.opensuse.org/tests/805158#step/kontact/15 shows fatal error, so this problem is not sporadic.
Updated by zluo almost 6 years ago
- Status changed from Feedback to In Progress
from okurz:
I don't see how akregator is related to https://bugzilla.suse.com/show_bug.cgi?id=1105207 . Can we please split the concerns and you just focus on the kontact error and not akregator?
For easier testing have you considered https://progress.opensuse.org/issues/35589#note-35 to just schedule booting from an image and call kontact? If you do not schedule "amarok" it should prevent the "akregator" problem
--
checking...
Updated by okurz almost 6 years ago
- Priority changed from Normal to High
- Target version changed from Milestone 21 to Milestone 22
median cycle time exceeded -> bumping prio and target version to current milestone
Updated by zluo almost 6 years ago
- Status changed from Feedback to In Progress
working on this again and need to provide new verification run because the old results are gone after I re-installed my workstation.
Updated by okurz almost 6 years ago
PR merged. Now we can focus again on the original issue of akregator showing up instead of kontact.
Updated by zluo almost 6 years ago
- Status changed from Feedback to In Progress
Checking this again. I think akregator (news feed) needs to be closed separately because it can be started on its own as /usr/bin/akregator.
The original issue shows akregator staying open although kontact is already gone.
Updated by zluo almost 6 years ago
zluo@f40:/var/lib/openqa/tests/opensuse/tests> cnf akregator
The program 'akregator' can be found in the following package:
* akregator [ path: /usr/bin/akregator, repository: zypp (download.opensuse.org-oss) ]
Try installing with:
sudo zypper install akregator
zluo@f40:/var/lib/openqa/tests/opensuse/tests> cnf kontact
The program 'kontact' can be found in the following package:
* kontact [ path: /usr/bin/kontact, repository: zypp (download.opensuse.org-oss) ]
Try installing with:
sudo zypper install kontact
Updated by okurz almost 6 years ago
Not sure what you want to say with the latest two comments. However, the original problem was, as described in #35589#note-20, that akregator is started by a partially typed or mistyped command in the desktop runner. Immediately after the start of the test module "kontact" the test starts to type "a" for "akonadictl" in https://openqa.opensuse.org/tests/832330/file/video.ogv#t=306.21,306.25 , for which krunner suggests amarok, which then somehow consumes the next character "k". The typing continues with "o", see https://openqa.opensuse.org/tests/832330/file/video.ogv#t=306.41,306.45 , so it can only turn out wrong.
Updated by zluo almost 6 years ago
Well, this kind of typing issue happens sometimes and we don't have a good idea how to fix it, as it is more related to setup/performance. My test runs don't show this issue, so it is very hard to reproduce.
Updated by zluo almost 6 years ago
Of course we should try to make the test module more robust and provide a workaround if needed. In this case with the Akonadi service, if the akonadi service is not running, starting kontact should work anyway. If akregator got started wrongly by typing issue, then we cannot fix this, but try to check this error and provide softfail for that.
This is already provided by my PR.
Updated by okurz almost 6 years ago
zluo wrote:
If akregator got started wrongly by typing issue, then we cannot fix this
I am sure we can fix it. If this is only shown by our automated tests then we can either ensure that this is handled as a valid bug and fixed accordingly (I am not aware of any bug report) or at least provide a workaround if we can not find a fix.
If you can not reproduce it locally you can try out https://progress.opensuse.org/issues/44327 and trigger tests on production based on your local branch.
zluo wrote:
This is already provided by my PR.
Which PR? https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6226 ? It only aborts the test early on boo#1105207
If you do not have further plans for this ticket yourself it's ok to unassign.
https://openqa.opensuse.org/tests/835854#step/kontact/8 is one of the latest failures. The screenshot here directly shows the incorrectly typed string in krunner.
Updated by zluo almost 6 years ago
- Assignee deleted (zluo)
Well, the typing issue is a general issue on osd. I won't be able to fix this. Could someone else please take over...
Updated by okurz almost 6 years ago
- Blocks action #41540: [functional][u][sporadic] test fails in kontact as command "killall" is mistyped in x11_start_program (seems plasma specific problem) added
Updated by okurz almost 6 years ago
- Status changed from Workable to In Progress
- Assignee set to okurz
Taking latest extratests_in_kde and triggering a custom schedule for gathering current fail rate:
env openqa-clone-set https://openqa.opensuse.org/tests/838225 poo35589_okurz_kde_wayland SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/amarok,tests/x11/kontact
https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland
and in parallel to try the "security mitigation off"-switches mentioned in https://bugzilla.opensuse.org/show_bug.cgi?id=1117833#c31
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 838225 TEST=poo35589_okurz_kde_wayland_mitigation_off_001 BUILD=poo35589_okurz_kde_wayland_mitigation_off _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/amarok,tests/x11/kontact EXTRABOOTPARAMS="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off"
-> Created job #839521: opensuse-Tumbleweed-DVD-x86_64-Build20190124-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t839521
and latest retrigger based on rebased code: https://openqa.opensuse.org/tests/843506#step/start_wayland_plasma5/52
Fails to login in a stable way. It looks like sddm crashes the session on login (seen in the video) but also #46223 is impacting us as the "focussed" password prompt is not correctly detected. Blocked by #46223 and waiting for the results from #39926 as this is also about wayland.
Updated by okurz almost 6 years ago
- Blocked by action #46223: [functional][u] test fails in user_gui_login - fails to re-login, password not typed or entry field not focussed? added
Updated by okurz almost 6 years ago
Hm, maybe I did something wrong on cloning because of not configuring the VM for virtio.
First a verification for the helper commits in https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6709 with
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_mitigation_off_001 BUILD=poo35589_okurz_kde_wayland_mitigation_off _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1
Created job #846236: opensuse-Tumbleweed-DVD-x86_64-Build20190202-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t846236
Looks good. Now, onto loading the right modules again:
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_mitigation_off_001 BUILD=poo35589_okurz_kde_wayland_mitigation_off _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/amarok,tests/x11/kontact EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1
Created job #846305: opensuse-Tumbleweed-DVD-x86_64-Build20190202-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t846305
failed in amarok. Maybe it's better to start tests first with kontact only?
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_mitigation_off_$i BUILD=poo35589_okurz_kde_wayland_mitigation_off _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 ; done
and for crosschecking with mitigation on:
$ for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_kontact_only_$i BUILD=poo35589_okurz_kde_wayland_kontact_only _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 ; done
Compare the failure rate in both to see whether 1) mitigation off makes a difference and 2) kontact is stable or fails even if it's the only module (might need the krunner module for settle-down anyway).
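A cheap sanity check before comparing both builds could be to confirm that the switches from EXTRABOOTPARAMS_BOOT_LOCAL really reached the booted kernel. A minimal sketch, assuming the kernel command line can be read from /proc/cmdline on the SUT; the helper name missing_boot_params and the sample command line are made up for illustration and are not part of the test distribution:

```shell
#!/bin/sh
# Print every expected boot parameter that is missing from a kernel command line.
# $1: kernel command line (e.g. the content of /proc/cmdline)
# remaining arguments: parameters that are expected to be present
missing_boot_params() {
    # Pad with spaces so that every parameter can be matched as a whole word
    cmdline=" $1 "
    shift
    for param in "$@"; do
        case "$cmdline" in
            *" $param "*) ;;                # present, nothing to report
            *) printf '%s\n' "$param" ;;    # absent, report it
        esac
    done
}

# Hypothetical command line for illustration; on the SUT one would call
#   missing_boot_params "$(cat /proc/cmdline)" nopti nospec nospectre_v2 ...
missing_boot_params "BOOT_IMAGE=/boot/vmlinuz root=/dev/vda2 nopti nospec" \
    nopti nospec nospectre_v2
```

An empty output would mean that all expected switches are present; any printed name is a switch that did not make it to the kernel.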
Result
https://openqa.opensuse.org/tests/overview?distri=opensuse&version=Tumbleweed&build=poo35589_okurz_kde_wayland_mitigation_off failed in 17/100 jobs vs. https://openqa.opensuse.org/tests/overview?version=Tumbleweed&distri=opensuse&build=poo35589_okurz_kde_wayland_kontact_only failed in 25/100 jobs. Let's schedule 2x100 more for better statistics. My assumption, though, is that mitigation off helps, but not in all cases.
EDIT: That's now 39/200 -> 19.5% failure rate for mitigation off, 44/200 -> 22% failure rate for mitigation on -> no significant difference
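The "no significant difference" reading can be cross-checked with a pooled two-proportion z-test. This is only a sketch of one common way to compare two failure rates, not necessarily how the conclusion above was reached; the function name compare_rates is made up for illustration:

```shell
#!/bin/sh
# Pooled two-proportion z-test on two failure counts.
# Usage: compare_rates FAILED_A TOTAL_A FAILED_B TOTAL_B
compare_rates() {
    awk -v fa="$1" -v na="$2" -v fb="$3" -v nb="$4" 'BEGIN {
        pa = fa / na
        pb = fb / nb
        pooled = (fa + fb) / (na + nb)
        se = sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
        z = (pa - pb) / se
        if (z < 0) z = -z
        # 1.96 is the two-sided 5% threshold of the standard normal distribution
        verdict = (z > 1.96) ? "significant" : "not significant"
        printf "%.1f%% vs. %.1f%%, |z| = %.2f -> %s at the 5%% level\n", pa * 100, pb * 100, z, verdict
    }'
}

# Numbers from the EDIT above: mitigation off vs. mitigation on
compare_rates 39 200 44 200
```

For 39/200 vs. 44/200 the statistic stays well below the 1.96 threshold, so the sketch agrees that the two builds cannot be told apart with this sample size.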
Next steps
sysrich mentioned a potential relation to boo#1112824 because the Tumbleweed kernel has preemption enforced whereas on older versions and SLE and Leap we use voluntary preemption (or off? Something along those lines). Would be worth a try to run "kernel-vanilla" but I am not sure if we have a "test module" ensuring the installation of that package. Next step after that would be to rebuild the kernel with the changed setting and try that one.
EDIT: We do have a test module to change the kernel, tests/kernel/change_kernel.pm
for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_mitigation_offkernel-vanilla_$i BUILD=poo35589_okurz_kde_wayland_mitigation_off_kernel-vanilla _GROUP=0 SCHEDULE=tests/kernel/change_kernel,tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 CHANGE_KERNEL_REPO=https://download.opensuse.org/repositories/Kernel:/stable/standard/ CHANGE_KERNEL_PKG=kernel-vanilla; done
Created job #846861: opensuse-Tumbleweed-DVD-x86_64-Build20190202-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t846861 -> passed so this approach works. 12 minutes vs. 8 minutes runtime. Let's schedule more of these:
for i in {002..200}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 845661 TEST=poo35589_okurz_kde_wayland_mitigation_offkernel-vanilla_$i BUILD=poo35589_okurz_kde_wayland_mitigation_off_kernel-vanilla _GROUP=0 SCHEDULE=tests/kernel/change_kernel,tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 CHANGE_KERNEL_REPO=https://download.opensuse.org/repositories/Kernel:/stable/standard/ CHANGE_KERNEL_PKG=kernel-vanilla; done
EDIT: 40/200 failed -> 20% failure rate -> no significant difference, "kernel-default" is not worse than "kernel-vanilla"
How about the same as above with default kernel and no changes to mitigation with Leap 15.1:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 844165 TEST=poo35589_okurz_kde_wayland_kontact_only_leap151_$i BUILD=poo35589_okurz_kde_wayland_kontact_only_leap151 _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 ; done
Updated by okurz almost 6 years ago
The last check on Leap15.1 did not work. All 100 jobs failed in https://openqa.opensuse.org/tests/847346#step/boot_to_desktop/5 to boot to a graphical desktop although the serial port shows that after 700s there are still some services responding with output. Currently no good idea what is wrong here.
Crosschecking as suggested in https://bugzilla.opensuse.org/show_bug.cgi?id=1112824#c143
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 847596 TEST=poo35589_okurz_kde_wayland_mitigation_off_kernel-twvolun_$i BUILD=poo35589_okurz_kde_wayland_mitigation_off_kernel-twvolun _GROUP=0 SCHEDULE=tests/kernel/change_kernel,tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact EXTRABOOTPARAMS_BOOT_LOCAL="nopti nospec nospectre_v2 nospec_store_bypass_disable spectre_v2_user=off" _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1 CHANGE_KERNEL_REPO=https://download.opensuse.org/repositories/home:/favogt:/twvolun/standard/; done
EDIT: 17/99 failed (1 incomplete) -> 17% failure rate -> no significant difference
TODO: try to test successfully on Leap, any version. I have the feeling that we are using krunner too much in kontact. We could make that more stable with more waiting or using e.g. the user-console although that is most likely not faster because of switching and console enabling and such. Even worse is probably xterm where we set the prompt and disable the serial terminal and such multiple times.
Updated by okurz almost 6 years ago
- Target version changed from Milestone 22 to Milestone 23
Updated by okurz over 5 years ago
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6942 created to improve robustness of start_wayland_plasma5 as a partial improvement of a side-failure.
So based on above results I come to the conclusion that we should either type even slower in krunner in the wayland scenario or use an xterm. However, for failures like https://openqa.opensuse.org/tests/848413#step/kontact/18 where we fail to type a single word correctly I doubt that xterm is really any better. So let's try to type slower in krunner@wayland.
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #866260: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t866260 -> https://openqa.opensuse.org/tests/overview?distri=opensuse&build=poo35589_okurz_kde_wayland_very_slow_krunner&version=Tumbleweed
Updated by okurz over 5 years ago
Still failed in some instances, e.g. https://openqa.opensuse.org/tests/866262#step/kontact/8 mentioning "ho" instead of "echo" so I added a wait_still_screen(3) for the case of WAYLAND in init_desktop_runner:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Updated by okurz over 5 years ago
Still some instances of incomplete strings. https://openqa.opensuse.org/tests/866980/file/video.ogv#t=27.30,27.31 shows that of the word "kontact" only "nt…" is showing up in the krunner dialog and the first two characters ended up in what looks like yet another, previous krunner dialog. Trying with substrings.
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact _SKIP_POST_FAIL_HOOKS=1 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
6/99 failed, that's significantly less. However, the failures still look the same in most cases, e.g. some characters are typed in a krunner dialog, the next characters end up in a krunner dialog which pops up in a different location. I assume krunner is really crashing but nevertheless I guess we need a workaround.
To try one more time something different, typing one character, waiting, second character, waiting, typing rest plus collect logs from post_fail_hook:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
-> Created job #868058: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t868058 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs&distri=opensuse
Updated by okurz over 5 years ago
That's funny, when we fail, we reproducibly fail to collect and upload logs as in https://openqa.opensuse.org/tests/868131#step/kontact/38 showing only the help for "ar" when it should be "tar …".
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #869814: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t869814 ->
https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs&distri=opensuse&version=Tumbleweed
Updated by okurz over 5 years ago
Still missing the first character of "tar" in https://openqa.opensuse.org/tests/868131#step/kontact/38 . Writing an explicit single whitespace in export_kde_logs.
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #869914: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t869914 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace&distri=opensuse
Updated by okurz over 5 years ago
https://openqa.opensuse.org/tests/870044 has more logs now but could not find any "coredumps" or similar. The krunnerrc consists of:
[General]
history=kontact,echo "first-start=false" >> ~/.config/kmail2rc,echo "first-start=false" >> ~/.kde4/share/config/kmail2rc,ho "[General]" >> ~/.kde4/share/config/kmail2rc,ec,akonadictl start,xterm
[PlasmaRunnerManager]
LaunchCounts=1 services_xterm.desktop
So no "plugins" are explicitly enabled. It is also interesting that the characters "ec" of the last "echo" show up in the "history" variable, separated from "ho …"
Further ideas: Call check_desktop_runner in start_wayland_plasma5 explicitly, and add an explicit test module for krunner which types the potentially problematic string "echo":
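Spotting such fragments in the "history" value can be sketched as follows. This is only an illustration: the function name find_history_fragments is hypothetical, and it assumes that entries are separated by commas and that no command contains a comma itself, which holds for the history line shown above:

```shell
# List krunnerrc history entries that do not start with an expected command.
# Reads the value of "history=" on stdin; expected command names are arguments.
find_history_fragments() {
    tr ',' '\n' | while IFS= read -r entry || [ -n "$entry" ]; do
        [ -n "$entry" ] || continue
        ok=0
        for cmd in "$@"; do
            case $entry in
                # Entry is exactly the command or the command followed by arguments
                "$cmd"|"$cmd "*) ok=1 ;;
            esac
        done
        if [ "$ok" -eq 0 ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

Fed with the history value above and the expected commands kontact, echo, akonadictl and xterm, this would list the "ho …" and "ec" entries.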
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #870056: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t870056 -> https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test&distri=opensuse&version=Tumbleweed
Could not find anything obvious in the logs. Of course, there are quite some messages that look like error messages, but I am not sure which one would point to the problem I observe, and I guess there is no use in me reporting openSUSE bugs about any of the specific issues. They could be upstream bugs; however, that's not my primary concern now.
On a VM running openSUSE Leap 15.1 starting "plasma (wayland)" I could not reproduce the problems manually, but I also wanted to try out what happens when I crash krunner deliberately. Starting krunner in konsole and calling pkill -SIGSEGV krunner causes the crash dialog with the frowning smiley in the systray to show up, and console messages state "KCrash: Attemping to start /usr/bin/krunner from kdeinit" and "KCrash: Application 'krunner' crashing...", so an explicit message. I have seen nothing like that in the openQA logs. What I have learned is that in the xsession-errors log, e.g. in https://openqa.opensuse.org/tests/870044/file/kontact-XSE.log , the message "Using Wayland-EGL" should point to krunner starting initially. Whenever I open krunner in a VM manually I see another error message about an unexpected attribute on a virtual screen but none of that kind for the openQA VMs.
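Checking a downloaded log for the explicit crash message quoted above could look like the following sketch. The function name log_has_kcrash is hypothetical and the match string is taken from the console message seen in the manual experiment, so it assumes that a crash inside an openQA VM would log the same text:

```shell
# Succeed if the given log file contains a KCrash report for the named application.
log_has_kcrash() {
    logfile=$1; app=$2
    grep -q "KCrash: Application '$app' crashing" "$logfile"
}
```

For example, log_has_kcrash kontact-XSE.log krunner && echo "krunner crashed" after downloading the log file from a job.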
The web research results are very limited so far. The best match is https://opensuse.opensuse.narkive.com/OkKXNi2z/aargh-why-does-krunner-keep-disappearing which suggests disabling plugins. https://lists.opensuse.org/opensuse/2010-03/msg00039.html mentions how I could debug. If disabling plugins by config file does not work, maybe I can delete some files.
Trying to disable krunner plugins:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 864281 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test_krunner_plugins_disabled _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
-> Created job #870840: opensuse-Tumbleweed-DVD-x86_64-Build20190226-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t870840 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test_krunner_plugins_disabled&distri=opensuse
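For reference, disabling plugins by config file could be sketched as below. This is an assumption, not what the fix branch necessarily does: the function name print_disabled_plugins is hypothetical, and the "[Plugins]" group with "<id>Enabled=false" keys is my understanding of how Plasma 5 stores this in krunnerrc, so it should be compared against a krunnerrc written by the krunner settings dialog:

```shell
# Print a krunnerrc "[Plugins]" section that disables the given runner plugins.
# The plugin ids passed in are up to the caller; none are verified here.
print_disabled_plugins() {
    echo '[Plugins]'
    for id in "$@"; do
        printf '%sEnabled=false\n' "$id"
    done
}
```

A call like print_disabled_plugins baloosearch bookmarks >> ~/.config/krunnerrc would append the section; the two plugin ids are only examples and may not match the ids on a given system.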
Seems a lot of jobs are incomplete now because of missing assets, let's try based on more recent base job 870613:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test_krunner_plugins_disabled _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #871054: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t871054 -> https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland_very_slow_krunner_wait_still_screen_split_twice_with_logs_fixed_export_kde_logs_type_explicit_whitespace_plus_krunner_test_krunner_plugins_disabled&distri=opensuse&version=Tumbleweed
All incomplete. Well, not sure why.
A retriggered job https://openqa.opensuse.org/tests/8713700 still works ok, so is it something with the code? Why is there no autoinst-log.txt then?
I have the suspicion that the build string is getting too long :D
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_slow_krunner_plugins_disabled _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
same fail ratio, still failing like in
https://openqa.opensuse.org/tests/871386/file/video.ogv#t=32.08,32.14
Only shell commands should be enabled but it looks like some software center suggestions are still there. I guess I need to recreate krunnerrc from a more recent base system. Trying in a TW and a Krypton VM.
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_slow_more_krunner_plugins_disabled _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #871511: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t871511 -> https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland_slow_more_krunner_plugins_disabled&distri=opensuse&version=Tumbleweed
Still failing to type the "first-start" stuff, so not that stable.
We should type fewer commands in kontact:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_slow_krunner_plugins_disabled_less_kontact_commands _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner2 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
-> Created job #871671: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t871671 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&build=poo35589_okurz_kde_wayland_slow_krunner_plugins_disabled_less_kontact_commands&distri=opensuse
This reduced the fail ratio to 4/100. In all four cases the problem is that the "echo" command is ineffective due to an incorrect string in krunner, e.g. as visible in https://openqa.opensuse.org/tests/871752/file/video.ogv#t=19.42,19.46 we see the incomplete string 'e "[Ge' instead of the expected 'echo "[Ge'. Should we really type that command in the user console instead?
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_kontact_commands_user_console_skip_krunner _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner4 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #871811: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t871811 -> https://openqa.opensuse.org/tests/overview?distri=opensuse&build=poo35589_okurz_kde_wayland_kontact_commands_user_console_skip_krunner&version=Tumbleweed
Hm, still failing to type "kontact" or "killall" correctly, same fail rate.
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_krunner_plugins_disabled_user_console_match_typed _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner4 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #871951: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t871951 -> https://openqa.opensuse.org/tests/overview?distri=opensuse&build=poo35589_okurz_kde_wayland_krunner_plugins_disabled_user_console_match_typed&version=Tumbleweed
Added another 100 to the set. All 200 passed now \o/ That is 38fafc7c5 in fix/krunner4
I wanted to try out the "match_typed" parameter but made the mistake of not committing it. So there should be no difference to the earlier state when it was last failing. Adding another 200 to the set.
I could try to enable more debug info for krunner, e.g. with kdebugsettings, but manually that does not bring so much more information.
As 400 (!) jobs pass now I want to crosscheck whether I need all the changes by partially reverting, e.g. no string splitting, no krunner module, no wait_still_screen:
for i in {001..40}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_user_console_rest_reverted _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner5 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #872359: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t872359 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&distri=opensuse&build=poo35589_okurz_kde_wayland_user_console_rest_reverted
1/40 failed. Keeping the string splitting + wait_still_screen:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 870613 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_user_console_rest_reverted2 _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner5 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #872830: opensuse-Tumbleweed-DVD-x86_64-Build20190305-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t872830 -> https://openqa.opensuse.org/tests/overview?version=Tumbleweed&distri=opensuse&build=poo35589_okurz_kde_wayland_user_console_rest_reverted2
Still 4/41 failed. I should stay on fix/krunner4 for now
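Fail ratios such as 4/100 or 1/40 above can also be tallied from a plain list of job results, one per line. A minimal sketch follows; the function name fail_ratio is hypothetical, and it counts only the result "failed", so incomplete jobs would need to be counted separately:

```shell
# Read one job result per line on stdin and print "<failed>/<total> failed".
fail_ratio() {
    total=0; failed=0
    while IFS= read -r result || [ -n "$result" ]; do
        [ -n "$result" ] || continue
        total=$((total + 1))
        if [ "$result" = "failed" ]; then
            failed=$((failed + 1))
        fi
    done
    echo "$failed/$total failed"
}
```

The input could come from something like openqa_client_o3 --json-output jobs build=$build | jq -r '.jobs | .[] | .result' | fail_ratio, assuming the job objects carry a "result" field next to the "id" field used elsewhere in this ticket.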
How does the current "kde-wayland" scenario fare based on fix/krunner4:
for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 872655 TEST=poo35589_okurz_kde_wayland_base_$i BUILD=poo35589_okurz_kde_wayland_base_fix_krunner4 _GROUP=0 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner4; done
Some additional fails in firefox_audio but it's hard to see as the modules "ooffice, oocalc, oomath" always fail. I should exclude them. Also, should run these longer scenarios with reduced prio:
build=poo35589_okurz_kde_wayland_base_fix_krunner4; for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 872655 TEST=poo35589_okurz_kde_wayland_base_$i BUILD=$build _GROUP=0 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner4 _SKIP_POST_FAIL_HOOKS=1 EXCLUDE_MODULES=ooffice,oomath,oocalc; done ; for i in $(openqa_client_o3 --json-output jobs build=$build state=scheduled | jq '.jobs | .[] | .id') ; do openqa_client_o3 jobs/$i put --json-data '{"priority": 90}'; done
Updated by okurz over 5 years ago
- Status changed from In Progress to Feedback
Waiting for review and merge of https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6943 using the branch "fix/krunner2".
However, latest test results based on fix/krunner4 show that all that is not enough, many failures in kontact and other modules. I should probably try to create a proper PR based on fix/krunner4 as well and schedule with the schedule updates, e.g. including the krunner module and disabling plugins:
build=poo35589_okurz_kde_wayland_base_fix_krunner5; for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 877711 TEST=poo35589_okurz_kde_wayland_base_$i BUILD=$build _GROUP=0 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner5 _SKIP_POST_FAIL_HOOKS=1 EXCLUDE_MODULES=ooffice,oomath,oocalc PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse NEEDLES_DIR=/var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/products/opensuse/needles ; done ; for i in $(openqa_client_o3 --json-output jobs build=$build state=scheduled | jq '.jobs | .[] | .id') ; do openqa_client_o3 jobs/$i put --json-data '{"priority": 90}'; done
Updated by okurz over 5 years ago
- Status changed from Feedback to In Progress
First part merged. After discussing with mgriessmeier and favogt I have another tiny idea: Checking that the desktop runner is there after every single character:
for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 880149 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_check_border _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/krunner,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner6 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
-> https://openqa.opensuse.org/tests/880166
This seems to break krunner completely.
for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 880149 TEST=poo35589_okurz_kde_wayland_$i BUILD=poo35589_okurz_kde_wayland_check_border _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner6 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
-> https://openqa.opensuse.org/tests/880219
This looks promising, no fail, easy to understand screenshots, single screenshot for every character. Not necessarily what we want in all cases but let's see if this helps for debugging or as fix :)
let's try a combined small and big set:
build=poo35589_okurz_kde_wayland_check_border; for i in {002..200}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 880149 TEST=poo35589_okurz_kde_wayland_$i BUILD=$build _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner6 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done ; for i in $(openqa_client_o3 --json-output jobs build=$build state=scheduled | jq '.jobs | .[] | .id') ; do openqa_client_o3 jobs/$i put --json-data '{"priority": 90}'; done; build=poo35589_okurz_kde_wayland_check_border_krunner6 ; for i in {001..100}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 880149 TEST=poo35589_okurz_kde_wayland_base_$i BUILD=$build _GROUP=0 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner6 _SKIP_POST_FAIL_HOOKS=1 EXCLUDE_MODULES=ooffice,oomath,oocalc PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse NEEDLES_DIR=/var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/products/opensuse/needles ; done ; for i in $(openqa_client_o3 --json-output jobs build=$build state=scheduled | jq '.jobs | .[] | .id') ; do openqa_client_o3 jobs/$i put --json-data '{"priority": 90}'; done
Created job #880235: opensuse-Tumbleweed-DVD-x86_64-Build20190313-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t880235 -> https://openqa.opensuse.org/tests/overview?distri=opensuse&build=poo35589_okurz_kde_wayland_check_border&version=Tumbleweed
and https://openqa.opensuse.org/tests/overview?build=poo35589_okurz_kde_wayland_check_border_krunner6&distri=opensuse&version=Tumbleweed -> is not what I wanted because it should have been the original kde-wayland scenario, not extra_tests_on_kde. However, it still gives valuable information about the (in-)stability, mainly of gnucash failing in 38/100 jobs. Created tickets for chrome #49361, wine #49358, gnucash #49355
Back to the previous problem: https://openqa.opensuse.org/tests/880241#step/kontact/18 shows the characters "ko" of "kontact" typed correctly, while the following screen https://openqa.opensuse.org/tests/880241#step/kontact/19 shows "n" but with the two previous characters lost. So krunner vanished in between and somehow reappeared. Does anyone have an idea?
Updated by okurz over 5 years ago
- Status changed from In Progress to Feedback
I asked in #opensuse-kde if anyone has an idea as well. I am a bit out of ideas :)
EDIT: Suggestion from fvogt: With export QT_LOGGING_RULES=*.debug=true it'll log every single event, but it might also slow it down enough that it works - still worth a try
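The suggestion above can be applied to a single process without exporting the variable for the whole session. A minimal sketch, with the wrapper name with_qt_debug being hypothetical:

```shell
# Run a command with Qt debug logging enabled for all logging categories.
with_qt_debug() {
    QT_LOGGING_RULES='*.debug=true' "$@"
}
```

For a manual try this could be called as with_qt_debug krunner 2> krunner-debug.log, where the log file name is only an example. Within the openQA test the variable would instead have to be set before the desktop session starts krunner.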
Updated by okurz over 5 years ago
- Priority changed from High to Normal
[15/03/2019 15:54:40] <okurz> DimStar, coolo: I haven't realized so far that oocalc/ooffice looks better, https://openqa.opensuse.org/tests/880736 kde-wayland@virtio is soft-failed. Not seen that often ;) I hope I could improve "kontact" stability recently as well
Updated by okurz over 5 years ago
- Target version changed from Milestone 23 to Milestone 25
build=poo35589_okurz_kde_wayland_qt_log; for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 881292 TEST=poo35589_okurz_kde_wayland_$i BUILD=$build _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#fix/krunner6 MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #883005: opensuse-Tumbleweed-DVD-x86_64-Build20190315-extra_tests_on_kde@64bit -> https://openqa.opensuse.org/t883005 -> https://openqa.opensuse.org/tests/overview?distri=opensuse&build=poo35589_okurz_kde_wayland_qt_log&version=Tumbleweed
[18/03/2019 17:22:45] <okurz> fvogt: so in https://openqa.opensuse.org/tests/883227#step/kontact/52 you can see that we type "k" of "killall" in krunner before krunner vanishes. https://openqa.opensuse.org/tests/883227/file/kontact-XSE.log is the complete logfile that should include everything from QT_LOGGING_RULES="*.debug=true". Maybe you can find something?
[18/03/2019 17:23:08] <fvogt> okurz: Yay, I'll have a quick look now and a deeper look later
then:
- fvogt When the "i" is being typed, the window state changes to hidden and then after a while there's a sudden size change from 532x647 to 532x47. The only log message during the window state transition is "org.kde.kactivities.lib.core: Killing the consumer". Whatever that means... that's at least as much logging as we can get out of krunner. WAYLAND_DEBUG=1 would produce even more, but we already know that the window state changes. Meh, the message is nothing of any value, just a destructor: https://github.com/KDE/kactivities/blob/0d6245995b43f8b1927b0e0c52859b5ebb3c2e19/src/lib/consumer.cpp#L64
- okurz So just losing focus of the window?
- fvogt Probably, but there's nothing about that either. It would explain why it disappears and reappears without content though
Updated by ggardet_arm over 5 years ago
PR reverted due to https://progress.opensuse.org/issues/49406
Updated by okurz over 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: kde-wayland
https://openqa.opensuse.org/tests/907714
Updated by okurz over 5 years ago
- Assignee changed from okurz to mgriessmeier
Move to new QSF-u PO after I moved to the "tools"-team. I mainly checked the subject line so in individual instances you might not agree to take it over completely into QSF-u. Feel free to discuss with me or reassign to me or someone else in this case. Thanks.
Updated by okurz over 5 years ago
- Status changed from Feedback to Resolved
- Assignee changed from mgriessmeier to okurz
I think this is fixed. I checked the history of kde@64bit as well as kde-wayland@virtio and I have seen no failures in kontact for 3 months.
Updated by okurz over 5 years ago
- Status changed from Resolved to Feedback
Seems some people still have problems with whatever is the current approach, so let's track this further.
Updated by okurz over 5 years ago
- Related to action #51944: [opensuse][functional][u] test fails in dolphin -- "kdialog --getopenfilename" fails to start added
Updated by okurz over 5 years ago
- Related to action #53045: [opensuse][kde][sporadic] krunner suggestions check is racy added
Updated by okurz over 5 years ago
Followup to PR by others:
build=poo35589_okurz_kde_wayland_qt_log; for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 883005 TEST=poo35589_okurz_kde_wayland_$i BUILD=$build _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#enhance/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #963260: opensuse-Tumbleweed-DVD-x86_64-Buildpoo35589_okurz_kde_wayland_qt_log-poo35589_okurz_kde_wayland_001@64bit_virtio -> https://openqa.opensuse.org/t963260
Updated by mgriessmeier over 5 years ago
- Target version changed from Milestone 25 to Milestone 26
okurz wrote:
Followup to PR by others:
build=poo35589_okurz_kde_wayland_qt_log; for i in {001..001}; do openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org 883005 TEST=poo35589_okurz_kde_wayland_$i BUILD=$build _GROUP=0 SCHEDULE=tests/boot/boot_to_desktop,tests/x11/start_wayland_plasma5,tests/x11/krunner,tests/x11/kontact CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse#enhance/krunner MACHINE=64bit_virtio QEMUVGA=virtio WAYLAND=1; done
Created job #963260: opensuse-Tumbleweed-DVD-x86_64-Buildpoo35589_okurz_kde_wayland_qt_log-poo35589_okurz_kde_wayland_001@64bit_virtio -> https://openqa.opensuse.org/t963260
this incompletes - did you trigger another one?
Updated by okurz over 5 years ago
- Blocked by action #53339: [opensuse] test fails in swing due to incorrect rendering on 16bpp framebuffers added
Updated by okurz over 5 years ago
- Subject changed from [opensuse][u][functional][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed to [opensuse][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed
- Status changed from Feedback to Blocked
no, not yet. I plan to check the stability of the tests after proceeding with #53339 which can have quite some impact. I guess I can take it for the time being and therefore bring it outside QSF-u. I guess you appreciate :)
Updated by mgriessmeier over 5 years ago
- Target version changed from Milestone 26 to Milestone 28
Updated by okurz about 5 years ago
- Subject changed from [opensuse][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed to [functional][u][opensuse][sporadic][medium] test fails in kontact - needs workaround for boo#1105207, then akregator not closed
- Status changed from Blocked to Workable
- Assignee deleted (okurz)
blocker resolved, back to QSF-u
Updated by zluo almost 5 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
let me check current status of kontact.
Updated by zluo almost 5 years ago
- Blocks deleted (action #41540: [functional][u][sporadic] test fails in kontact as command "killall" is mistyped in x11_start_program (seems plasma specific problem))
Updated by zluo almost 5 years ago
I checked on o3 for TW; at the moment we don't have any issue with kontact.
Triggering 50 test runs on f40.suse.de to see whether I can reproduce any issue with it.
Updated by zluo almost 5 years ago
check_bsc982138 in start_install.pm is no longer necessary since the product issue got fixed.
Updated by zluo almost 5 years ago
found issue with bootloader, this is quite strange:
http://f40.suse.de/tests/5844#step/bootloader/1 bootmenu-TW-xmas-20191209 matched but then the test failed :((
Updated by zluo almost 5 years ago
- Status changed from In Progress to Rejected
So I reject this ticket for now because the issue no longer occurs on o3 or in my test runs: