[qam][virtio][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in wayland"
Status: In Progress
Start date: 12/10/2018
Category: Bugs in existing tests
openQA test in scenario sle-15-SP1-Installer-DVD-x86_64-desktopapps-remote-client1@64bit-virtio-vga fails in
There are many failures in "window_system" here: https://openqa.suse.de/tests/overview?distri=sle&version=15-SP1&build=66.2&groupid=118
These failures can go away if you restart the jobs a few times, but the issue should get a proper fix.
The cause is that "script_output" calls "type_string", which types many characters too fast into gnome, so that some characters go missing or are repeated.
Similar issues happen on the PPC64 platform, which is why the ppc64 workers set "VNC_TYPING_LIMIT=10".
I tested this setting on OSD and it works fine for gnome.
(This setting is the number of keys typed into VNC per second; the default value is 50.)
In other tests, there are also typing failures from time to time.
So I wonder if this setting should be added to the sle-15-desktop medium type?
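For a sense of scale, here is a minimal sketch of what the two VNC_TYPING_LIMIT values mentioned above mean in practice. It assumes the limit is simply "keys per second" as described; the function names and the 200-character command are made up for illustration and are not part of openQA.

```python
def key_interval_ms(keys_per_second):
    """Minimum time between two keys, in milliseconds, for a given limit."""
    return 1000.0 / keys_per_second


def typing_time_s(text, keys_per_second):
    """Time needed to type `text` at the given limit, in seconds."""
    return len(text) / keys_per_second


# Hypothetical 200-character command, as script_output might type it.
command = "x" * 200

for limit in (50, 10):  # default value vs. the ppc64 worker setting
    print(
        f"VNC_TYPING_LIMIT={limit}: "
        f"{key_interval_ms(limit):.0f} ms per key, "
        f"{typing_time_s(command, limit):.0f} s for {len(command)} characters"
    )
```

So the default leaves 20 ms per key and the ppc64 setting leaves 100 ms per key, at the cost of typing five times slower.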
Fails since (at least) Build 66.2 (current job)
Last good: 63.1 (or more recent)
Always latest result in this scenario: latest
possible mistype ninjakeys
So I wonder if this setting should be added to the sle-15-desktop medium type?
I doubt this is a good idea. Also I do not think that "we type too fast". I guess the system is just loaded and a bit unresponsive just after startup. One can consider this a product bug.
I would suggest an approach similar to
which ensures with the desktop_runner module that there is a certain "cool down time" directly after login. I suggest to involve jbaier as the test module maintainer of window_system to handle this.
- Subject changed from [sle15sp1][desktop] test fails in window_system because "typing string is too fast in gnome" to [qam][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in gnome"
Same on qam jobs: from time to time this fails with typing issues.
@Oliver, I can confirm that the system load is not high when this error happens. Also, this error can happen elsewhere every now and then, which is very annoying.
I talked to another QA engineer; in his opinion and experience, openQA is typing too fast: 50 keystrokes a second is indeed too fast, especially for a desktop environment.
@jbaier, what's your opinion on this matter?
For me it is a little bit weird. It was working quite reliably at the time of development. Now I see those errors more and more often. Of course, I can rewrite the test (at the cost of losing the record_info feature for pretty presentation), however this issue could come back in other tests, as there are a few tests which type in the terminal.
It would be nice to know the root of the issue (load on the worker / openQA backend issue / unreliable typing shortly after boot).
- Status changed from New to In Progress
There are failures in places other than "window_system", for example in "firefox_smoke":
And I have an important discovery: all these "typing too fast" failures happen in wayland, not in X11.
I'll try to dig deeper.
- Subject changed from [qam][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in gnome" to [qam][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in wayland"
I tested X11+virtio, there's no problem; so the problem happens in wayland+virtio.
I'll try to reproduce this problem outside of openQA.
Hi @zcjia, @okurz, I saw that an upstream qemu bug was reported: https://bugs.launchpad.net/qemu/+bug/1802465 . It looks like X is also affected, but wayland is more easily impacted, with a shorter trigger string. The problem is that we are not sure yet how the issue will be escalated.
Given that quite a lot of SLE-15-SP1 desktop testing is blocked by this issue, and Alpha-3 is coming:
Is there a way to add an acceptable workaround to make the testing run more reliably? Thank you!
- Description updated (diff)
So, looking at this, it again looks to me like missing keys/ninjakeys: https://openqa.suse.de/tests/2167333#step/window_system/4 If you look closely, the N is never let go. Looking at the latest job, https://openqa.suse.de/tests/2277128 , there's a poem missing there, and the same goes for other jobs in the build.
- Subject changed from [qam][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in wayland" to [qam][virtio][sle15sp0][sle15sp1][desktop] test fails in window_system because "typing string is too fast in wayland"
yes, but in this example again it's specific to virtio, see #41681
- File gnome_wayland_qxl.png added
- File gnome_wayland_qxl2.png added
- File gnome_wayland_virtio.png added
- File gnome_wayland_virtio2.png added
- File gnome_x11_qxl.png added
- File gnome_x11_virtio.png added
- File kde_wayland_qxl.png added
- File kde_wayland_virtio.png added
- File kde_wayland_virtio2.png added
- File kde_x11_qxl.png added
I ran several tests with respect to
- qxl or virtio
- x11 or wayland
- gnome or kde
It seems to me that:
- qxl seems to mitigate the problem; compare gnome_wayland_virtio.png in the attachment
- x11 has far fewer typing misses, but they still happen once in a while
- A special case: ABCDE causes a high miss rate on the shift key (around 50%) for both x11 and wayland; please refer to *2.png in the attachment
The above is my opinion and it's based merely on my observation. I typed around 1000 characters for each test. The qemu version was based on stable-2.12, with queue_count enlarged to 102400.
What do you think? What other things can we tweak?
Good evaluation. I wonder why you selected libreoffice though. IMHO libreoffice is especially prone to cause mistyping, see #43889, maybe because of "spell correction" or something. The original ticket observation was about "window_system" so maybe the results would actually be different if you would conduct the experiment in a more "simple" environment, e.g. gedit/kwrite
I actually tested on both gedit and libreoffice. I will proceed with gedit from here on.
I found a special case when sending key streams that end with an uppercase letter, e.g.
If you refer to kde_wayland_virtio2.png in the above attachment, you will find an obvious miss on lowercase or uppercase every second time.
The reason is as follows: when we send one character A, the key codes actually sent are like
1. capslock pressed
4. enter pressed
So capslock is always used instead of shift, and now capslock gets toggled.
If a lowercase letter comes after a capital letter, capslock is pressed again, so capslock gets back to normal.
But if the key stream ends with an uppercase letter, it is observed that capslock is not pressed again and therefore stays toggled.
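The behaviour described above can be illustrated with a small simulation. This is only a sketch of the observation, not the actual qemu or VNC code: it assumes the sender starts every stream believing capslock is off and taps capslock whenever the case changes, while the guest keeps its real capslock state between streams. The function name `simulate_stream` is made up for this example.

```python
def simulate_stream(text, guest_caps_on):
    """Simulate typing `text` into a guest with the given capslock state.

    Returns (text as it appears in the guest, guest capslock state afterwards).
    Assumption: the sender tracks its own idea of capslock, starting at "off"
    for every stream, and taps capslock only when the letter case changes.
    """
    sender_caps_on = False
    shown = []
    for ch in text:
        if ch.isalpha() and ch.isupper() != sender_caps_on:
            # Capslock tap: toggles both the sender's belief and the guest state.
            sender_caps_on = not sender_caps_on
            guest_caps_on = not guest_caps_on
        # The guest renders the letter according to its real capslock state.
        shown.append(ch.upper() if guest_caps_on else ch.lower())
    return "".join(shown), guest_caps_on


guest_caps = False
for attempt in range(1, 5):
    shown, guest_caps = simulate_stream("abC", guest_caps)
    print(attempt, shown, "capslock left on" if guest_caps else "capslock off")
```

Under these assumptions, a stream ending in an uppercase letter leaves capslock on in the guest, so every second repetition comes out with inverted case, while a stream ending in a lowercase letter (e.g. "abCd") leaves the state clean.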
Now I'm wondering which part of the code actually deals with capslock: is it the ps2 driver handling it, or vncdotool?
What do you think?
Are you sure it's "caps lock" or just "shift"? I guess that it is either qemu or VNC, I doubt that vncdotool is involved here at all. https://build.opensuse.org/package/view_file/Virtualization/qemu/0025-Fix-tigervnc-long-press-issue.patch?expand=1 might be related? Just browsed the web quickly myself, maybe https://www.berrange.com/posts/2010/07/04/more-than-you-or-i-ever-wanted-to-know-about-virtual-keyboard-handling/ helps. Sorry, can't help more.
- File qemu_vnc_keypress.jpg added
Thank you for the comment!!
An update about this issue,
I checked on the guest OS what values the kernel receives, and they turn out to be wrong.
- First, on the guest OS, I found the keyboard device at /dev/input/eventX, where X can be any number.
- Then I monitored what this device (eventX) receives by
According to my observation, there are key misses at /dev/input/eventX, but I have not yet found any keys in the wrong order. A key miss might cause a lot more errors on x11 or wayland.
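The exact command used for monitoring is not recorded above. As one possible way to check for key misses, here is a sketch that parses raw `struct input_event` records (as read from an event device) and reports presses or releases that have no matching counterpart. It assumes a 64-bit Linux guest, where each record is a `timeval` (two longs) followed by type, code and value; the function names are made up for this example.

```python
import struct

# struct input_event on 64-bit Linux: timeval (two longs), u16 type, u16 code, s32 value
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 1  # event type for key presses and releases


def parse_key_events(raw):
    """Return (code, value) pairs for all EV_KEY records in `raw` bytes."""
    events = []
    for offset in range(0, len(raw) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, ev_type, code, value = struct.unpack_from(EVENT_FORMAT, raw, offset)
        if ev_type == EV_KEY:
            events.append((code, value))
    return events


def find_unpaired(events):
    """Report keys pressed twice, released without a press, or never released.

    value 1 = press, 0 = release; autorepeat (value 2) is ignored.
    """
    pressed = set()
    problems = []
    for code, value in events:
        if value == 1:
            if code in pressed:
                problems.append((code, "pressed again without release"))
            pressed.add(code)
        elif value == 0:
            if code not in pressed:
                problems.append((code, "released without press"))
            pressed.discard(code)
    problems.extend((code, "never released") for code in sorted(pressed))
    return problems
```

To use it on a real guest, one would read bytes from the keyboard's event device (this needs root) and pass them to `parse_key_events`, then to `find_unpaired`. A missing press or release in the output would correspond to a key miss at the kernel level.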
Additionally, qemu has a vnc layer, an input event queue layer and a ps2 key queue layer. I printed the values in the queues of these layers and found that they are all accurate.
So I presume it is a problem in how qemu interfaces with the guest kernel.
I attached a diagram of how qemu deals with key inputs; I made it myself and hope it helps.
I found that the bug resides both in qemu internals and in wayland.
In qemu, it is a concurrency problem: the capslock LED is handled by another thread, so keycode streams get messed up when the capslock LED event is emitted.
I opened up a bug on qemu upstream
The wayland bug is observed when the kernel event device receives correct keycode streams but gedit still shows wrong values. It seems related to bug 1117833: "Wayland: briefffffffffff unresponsivvvvvvvvvvvveness and repeated keys". This bug is not observed on x11.
To check whether the kernel is to blame, I used ftrace and dynamic_debug to peek into what the kernel receives. It seems the kernel receives what qemu sends it and passes all recognized key streams to
Found a related bug report for fedora.
Additionally, I will pay some attention to the wayland bug, since it happens much more often by comparison.
Issue raised on gnome-shell gitlab
As pointed out by zcjia, the wayland typing issue only happens after gnome-shell 3.2.6.
Note that this wayland issue is to blame for the severe typing flaws on wayland under gnome.
But typing issues on openQA also involve qemu issues, and possibly other parts. We found that create_hdd_* cases might have the user name typed wrong, but I failed to reproduce this problem in an installed x11 instance, where I tested around 250,000 characters through vnc at 10ms per char.
Regarding wrong user name typing during installation: That should fail the test earlier in https://openqa.opensuse.org/tests/829175#step/user_settings/2 with the right needle. It looks like the last matched needle has vanished. Maybe you are onto this already?