action #25638
closed[sles][functional][s390x] test fails in shutdown: VNC stall detected, needs to be investigated
0%
Description
Observation
openQA test in scenario sle-15-Leanos-DVD-s390x-create_hdd_gnome_s390x@s390x-kvm fails in shutdown (originally: shutdown).
It shows symptoms similar to #20022, at least regarding VNC stalls.
Reproducible
Fails since (at least) Build 278.1
Steps to reproduce
- clone test "create_hdd_textmode@s390x-kvm"
Expected result
We never executed shutdown successfully in any s390x tests but we try it now.
Further details
Always latest result in this scenario: latest (old), now latest
Updated by mgriessmeier about 7 years ago
- Related to action #20022: [sle][functional][zkvm][s390] incomplete test due to socket does not exist. Probably your backend instance could not start or died added
Updated by mgriessmeier about 7 years ago
- Blocks action #23406: [sle][functional]Use single test suite for create_hdd_gnome on all architectures (and downstream jobs) added
Updated by riafarov about 7 years ago
VNC stalls in other test suites too:
https://openqa.suse.de/tests/1193329
https://openqa.suse.de/tests/1193329
EDIT (okurz): These two should be ignored for the case of this ticket, they do not fail in shutdown
Updated by mgriessmeier about 7 years ago
I have not worked on that one in particular, but https://github.com/os-autoinst/os-autoinst/pull/862 might also help here if anyone wants to investigate further.
Updated by riafarov about 7 years ago
PR that addresses the review comment so that it can be merged: https://github.com/os-autoinst/os-autoinst/pull/864
Updated by okurz about 7 years ago
Did not complete in sprint 1. Main reason: a spontaneous packaging training which we were not aware of beforehand. We have the PR, which should improve user feedback a lot, and finishing this is definitely possible in the next sprint 2.
Updated by okurz about 7 years ago
- Due date changed from 2017-10-11 to 2017-10-25
Updated by okurz about 7 years ago
- Assignee deleted (mgriessmeier)
mgriessmeier is on vacation, unassigning for now.
Updated by zluo about 7 years ago
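Shutdown reached its target, but the run stops there and does not continue with further tests.
I did, however, fix an issue in shutdown.pm for textmode:
# s390x on SLE15 does not have a X11/VNC server
if (is_sle && sle_version_at_least('15') && check_var('ARCH', 's390x')) {
    # no graphical session available, so power off from the text console
    power_action('poweroff', textmode => 1);
}
else {
    # all other cases keep the default poweroff handling
    power_action('poweroff');
}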
Updated by riafarov about 7 years ago
- Related to action #18936: [tools][sles][functional] Enable 3 stress acceptance on s390x added
Updated by riafarov about 7 years ago
- Related to action #13216: [sles][functional][s390x] Run extratest on s390x added
Updated by zluo about 7 years ago
the problem is that after shutdown the worker still keeps X11 session alive. We need to find a way to terminate it.
Will discuss next week with @okurz and others.
Updated by zluo about 7 years ago
Blocked at the moment: I have no idea how to handle the running X11 session, and s390x-kvm is not available right now...
Updated by okurz about 7 years ago
- Status changed from New to In Progress
The ticket is not "new" anymore -> "in progress". Please discuss with mgriessmeier how to make sure we do not conflict with each other when using the instances.
Updated by zluo about 7 years ago
At the moment I cannot work on this ticket because s390x-kvm is not ready.
I have an idea for working around the issue with the running X11 session:
create a needle, call select_console and return.
Updated by mgriessmeier about 7 years ago
- Subject changed from [sles][functional][s390x] test fails in shutdown: VNC stall detected, needs to be investigated to [sles][functional][s390x][s390x-kvm] test fails in shutdown: VNC stall detected, needs to be investigated
- Status changed from In Progress to Rejected
- Assignee changed from zluo to mgriessmeier
We don't know if we ever saw this on production - linked job urls are a different problem
Updated by riafarov about 7 years ago
- Status changed from Rejected to New
Now failed in production on x-kvm. See https://openqa.suse.de/tests/1223349#
Updated by okurz about 7 years ago
- Priority changed from Normal to Urgent
Fails in many more jobs in build 305.1
Updated by mgriessmeier about 7 years ago
- Status changed from New to In Progress
Updated by mgriessmeier about 7 years ago
- Subject changed from [sles][functional][s390x][s390x-kvm] test fails in shutdown: VNC stall detected, needs to be investigated to [sles][functional][s390x] test fails in shutdown: VNC stall detected, needs to be investigated
- Status changed from In Progress to New
Updated by mgriessmeier about 7 years ago
- Status changed from New to In Progress
Updated by mgriessmeier about 7 years ago
I was investigating this with riafarov.
We came to the conclusion that it is weird backend behaviour in assert_shutdown in backend/testapi.pm and therefore created a ticket for the tools team as a blocker for this: https://progress.opensuse.org/issues/26886
07:51:42.7571 4947 <<< testapi::type_string(string='poweroff
', max_interval=250, wait_screen_changes=0, wait_still_screen=0)
[  OK  ] Stopped target Timers.
[  OK  ] Stopped Daily Cleanup of Temporary Directories.
[  OK  ] Stopped Early Kernel Boot Messages.
[  OK  ] Stopped target Multi-User System.
Stopping OpenSSH Daemon...
[  OK  ] Stopped target Network is Online.
Stopping Command Scheduler...
Stopping Session 1 of user root.
[  OK  ] Removed slice system-systemd\x2dhibernate\x2dresume.slice.
Stopping Load kdump kernel and initrd...
Stopping User Manager for UID 0...
07:51:42.9778 Debug: /var/lib/openqa/cache/tests/sle/tests/shutdown/shutdown.pm:32 called utils::power_action
[  OK  ] Removed slice system-getty.slice.
07:51:42.9780 4947 <<< testapi::assert_shutdown(timeout=60)
[  OK  ] Stopped /etc/init.d/after.local Compatibility.
07:51:43.0602 4949 Connection to root@s390p8.suse.de established
07:51:43.1630 4949 Command executed: ! virsh dominfo openQA-SUT-2 | grep -w 'shut off', ret=0
[  OK  ] Stopped target Login Prompts.
[  OK  ] Stopped Discard unused blocks once a week.
Stopping System Logging Service...
Stopping Restore /run/initramfs on shutdown...
Stopping Load kdump kernel early on startup...
Stopping Serial Getty on ttysclp0...
[  OK  ] Stopped System Logging Service.
[  OK  ] Stopped Serial Getty on ttysclp0.
[  OK  ] Stopped OpenSSH Daemon.
[  OK  ] Stopped Command Scheduler.
Stopping Postfix Mail Transport Agent...
[  OK  ] Stopped /etc/init.d/boot.local Compatibility.
[  OK  ] Removed slice system-serial\x2dgetty.slice.
[  OK  ] Stopped Session 1 of user root.
[  OK  ] Stopped Restore /run/initramfs on shutdown.
[  OK  ] Stopped Load kdump kernel early on startup.
[  OK  ] Stopped Load kdump kernel and initrd.
[  OK  ] Stopped Postfix Mail Transport Agent.
[  OK  ] Stopped target Host and Network Name Lookups.
[  OK  ] Stopped User Manager for UID 0.
[  OK  ] Removed slice User Slice of root.
Stopping Login Service...
Stopping Permit User Sessions...
[  OK  ] Stopped Login Service.
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped target User and Group Name Lookups.
Stopping Name Service Cache Daemon...
[  OK  ] Stopped target Network.
Stopping wicked managed network interfaces...
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped target Remote File Systems (Pre).
[  OK  ] Stopped Name Service Cache Daemon.
[  OK  ] Stopped wicked managed network interfaces.
Stopping wicked network nanny service...
[  OK  ] Stopped wicked network nanny service.
Stopping wicked network management service daemon...
[  OK  ] Stopped wicked network management service daemon.
Stopping wicked DHCPv4 supplicant service...
Stopping wicked AutoIPv4 supplicant service...
Stopping wicked DHCPv6 supplicant service...
[  OK  ] Stopped wicked DHCPv4 supplicant service.
[  OK  ] Stopped wicked DHCPv6 supplicant service.
[  OK  ] Stopped wicked AutoIPv4 supplicant service.
Stopping D-Bus System Message Bus...
[  OK  ] Stopped D-Bus System Message Bus.
[  OK  ] Stopped target Basic System.
[  OK  ] Stopped target Sockets.
[  OK  ] Closed Syslog Socket.
[  OK  ] Stopped target Slices.
[  OK  ] Removed slice User and Session Slice.
[  OK  ] Stopped target Paths.
[  OK  ] Closed D-Bus System Message Bus Socket.
[  OK  ] Stopped target System Initialization.
[  OK  ] Stopped Update is Completed.
[  OK  ] Stopped target Swap.
Deactivating swap /dev/disk/by-uuid…7d3-20c2-47b0-a965-339f4851c325...
[  OK  ] Stopped target Encrypted Volumes.
[  OK  ] Stopped Dispatch Password Requests to Console Directory Watch.
[  OK  ] Stopped Rebuild Journal Catalog.
[  OK  ] Stopped Apply Kernel Variables.
Stopping Load/Save Random Seed...
[  OK  ] Stopped Rebuild Hardware Database.
[  OK  ] Stopped Commit a transient machine-id on disk.
Stopping Update UTMP about System Boot/Shutdown...
[  OK  ] Stopped Load Kernel Modules.
[  OK  ] Stopped Update UTMP about System Boot/Shutdown.
[  OK  ] Stopped Create Volatile Files and Directories.
[  OK  ] Stopped Flush Journal to Persistent Storage.
[  OK  ] Stopped target Local File Systems.
Unmounting /var/lib/mariadb...
Unmounting /var/lib/pgsql...
Unmounting /var/crash...
Unmounting /boot/grub2/s390x-emu...
Unmounting /opt...
Unmounting /var/lib/libvirt/images...
Unmounting /var/cache...
Unmounting /var/lib/named...
Unmounting /run/user/0...
Unmounting /.snapshots...
Unmounting /boot/zipl...
Unmounting /usr/local...
Unmounting /var/opt...
Unmounting /var/tmp...
Unmounting /srv...
Unmounting /var/lib/mysql...
Unmounting /tmp...
Unmounting /var/lib/machines...
Unmounting /var/spool...
Unmounting /var/lib/mailman...
Unmounting /var/log...
[  OK  ] Stopped Load/Save Random Seed.
[  OK  ] Unmounted /usr/local.
[  OK  ] Unmounted /var/tmp.
[  OK  ] Unmounted /tmp.
[  OK  ] Unmounted /var/lib/mailman.
[  OK  ] Deactivated swap /dev/disk/by-path/ccw-0.0.0000-part3.
[  OK  ] Deactivated swap /dev/disk/by-partu…6a711-92d2-4a69-8bab-84e4628c909e.
[  OK  ] Deactivated swap /dev/vda3.
[  OK  ] Deactivated swap /dev/disk/by-uuid/…027d3-20c2-47b0-a965-339f4851c325.
[  OK  ] Unmounted /var/log.
[  OK  ] Unmounted /.snapshots.
[  OK  ] Unmounted /var/cache.
[  OK  ] Unmounted /var/spool.
[  OK  ] Unmounted /var/lib/pgsql.
[  OK  ] Unmounted /var/crash.
[  OK  ] Unmounted /boot/grub2/s390x-emu.
[  OK  ] Unmounted /opt.
[  OK  ] Unmounted /var/lib/libvirt/images.
[  OK  ] Unmounted /run/user/0.
[  OK  ] Unmounted /var/opt.
[  OK  ] Unmounted /srv.
[  OK  ] Unmounted /var/lib/machines.
[  OK  ] Unmounted /var/lib/named.
[  OK  ] Unmounted /var/lib/mysql.
[  OK  ] Unmounted /var/lib/mariadb.
[  OK  ] Unmounted /boot/zipl.
[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Stopped target Local File Systems (Pre).
[  OK  ] Stopped Create Static Device Nodes in /dev.
[  OK  ] Stopped Create System Users.
[  OK  ] Stopped Remount Root and Kernel File Systems.
[  OK  ] Reached target Shutdown.
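A note on reading the "Command executed: ! virsh dominfo openQA-SUT-2 | grep -w 'shut off', ret=0" line in the log above: assuming the command runs in a POSIX shell, the leading `!` inverts grep's exit status, so ret=0 means that 'shut off' was not found at that moment. The following is only a minimal sketch of that exit-status logic. It uses sample state lines instead of a real `virsh` call, the helper name `not_shut_off` is hypothetical, and `-q` is added merely to keep grep quiet.

```shell
#!/bin/sh
# Sample `virsh dominfo` state lines (illustrative text, not taken from the job).
running='State:          running'
shut_off='State:          shut off'

# Same shape as the logged check: `!` inverts the exit status of grep.
# Hypothetical helper; -q only suppresses grep's output.
not_shut_off() {
    ! printf '%s\n' "$1" | grep -qw 'shut off'
}

for state in "$running" "$shut_off"; do
    if not_shut_off "$state"; then
        echo "ret=0: 'shut off' not found, domain still up"
    else
        echo "ret=1: 'shut off' found, domain is down"
    fi
done
```

With these sample lines the first iteration takes the ret=0 branch and the second the ret=1 branch.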
Updated by mgriessmeier about 7 years ago
- Blocked by action #26886: [tools][s390x-kvm] investigate and improve 'assert_shutdown' function in testapi added
Updated by okurz about 7 years ago
- Description updated (diff)
We should keep one thing in mind. Unless I am mistaken we only have successfully executed "shutdown" on zkvm. So we shouldn't hunt for a regression on s390x-kvm and z/VM.
https://openqa.suse.de/tests/overview?distri=sle&version=12-SP3&build=0473&arch=s390x are all SLE 12 SP3 GM s390x tests for reference. IIUC for s390x zVM we do not have an implementation for "is_shutdown" so there should be the message "Backend does not implement is_shutdown - just sleeping". For s390x-kvm that should be svirt calling ! virsh dominfo $vmname | grep -w 'shut off'
which we see in autoinst-log.txt
I see an easy way out: We just skip everything that does not work on s390x (s390x-kvm and z/VM).
I tried to reproduce the problem on zVM locally with a simplified test plan that just tries to shut down, but could not reproduce it. It just works fine: http://lord.arch/tests/7741/file/autoinst-log.txt
We have seen the problem only on s390x-kvm and now on zVM as well but not on zkvm, correct?
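To illustrate the svirt check mentioned above (`virsh dominfo $vmname | grep -w 'shut off'`), here is a rough sketch of how such a check could be polled until a timeout expires. This is not the os-autoinst implementation: the function names `is_shut_off` and `wait_for_shutdown` are hypothetical, and `virsh` is replaced by a stub so the sketch runs without libvirt. The domain name and the 60 second timeout are taken from the log in this ticket.

```shell
#!/bin/sh
# Hypothetical stand-in for the real `virsh`, so the sketch runs without libvirt.
# It reports "running" for the first two calls and "shut off" afterwards.
# The call counter lives in a file because the stub runs inside a pipeline subshell.
calls_file=$(mktemp)
trap 'rm -f "$calls_file"' EXIT
echo 0 > "$calls_file"
virsh() {
    n=$(( $(cat "$calls_file") + 1 ))
    echo "$n" > "$calls_file"
    if [ "$n" -le 2 ]; then
        echo 'State:          running'
    else
        echo 'State:          shut off'
    fi
}

# Succeeds once `virsh dominfo` reports the state "shut off".
is_shut_off() {
    virsh dominfo "$1" | grep -qw 'shut off'
}

# Poll every <interval> whole seconds until shut off or <timeout> seconds passed.
wait_for_shutdown() {
    vmname=$1
    timeout=$2
    interval=${3:-1}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if is_shut_off "$vmname"; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}

if wait_for_shutdown openQA-SUT-2 60; then
    echo "domain is shut off"
else
    echo "timed out waiting for shutdown"
fi
```

The explicit timeout makes a domain that never reaches "shut off" end in a clear failure instead of waiting indefinitely, which is the kind of feedback that was missing when the job stalled in assert_shutdown.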
Updated by mgriessmeier about 7 years ago
for now we just skip assert_shutdown on s390x-kvm and z/VM:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3752
Updated by okurz about 7 years ago
- Copied to action #26914: [sle][functional][s390x][s390x-kvm] s390x-kvm does never exit from "assert_shutdown" but zkvm works -> investigate how the machines differ, maybe problem on s390p8? added
Updated by okurz about 7 years ago
- Status changed from In Progress to Feedback
mgriessmeier and I opted for the "easy way out" -> https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3752 , merged.
Once jobs no longer end up incomplete, we should create another ticket for a deeper investigation into why s390x-kvm and zkvm behave differently here. I think the z/VM part is covered by #26886 because the backend implementation is basically only a sleep, so it has nothing to do with the actual execution on z/VM.
Rest handled in #26914
skip_registration s390x-kvm passed shutdown now
Can't close ticket? set to feedback now, something blocking here?
Updated by okurz about 7 years ago
- Blocked by deleted (action #26886: [tools][s390x-kvm] investigate and improve 'assert_shutdown' function in testapi)
Updated by okurz about 7 years ago
- Related to action #26886: [tools][s390x-kvm] investigate and improve 'assert_shutdown' function in testapi added
Updated by okurz about 7 years ago
- Status changed from Feedback to Resolved
Not blocked by #26886 anymore, closing.