action #50111

coordination #49889: [functional][epic][y] Switch between installation/install shell in specific scenarios (hyperv, ssh,vnc)

[functional][y] Switch between installation/install shell in vnc installation

Added by riafarov over 2 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
SUSE QA - Milestone 30+
Start date:
2019-04-02
Due date:
% Done:

0%

Estimated time:
8.00 h

Description

Motivation

Currently there are several cases where we cannot switch to an install shell during an installation, although being able to switch to a console can be very useful.

The setup in https://openqa.suse.de/tests/latest?arch=x86_64&flavor=Installer-DVD&distri=sle&test=remote_vnc_controller&version=15-SP1&machine=64bit is completely wrong.
When we want to collect logs, they are collected from the SLE 12 SP3 image instead of from the SUT, which is wrong and dangerous.

Acceptance criteria

  1. Logs can be collected from the SUT in case of MM VNC installation

Further details

This is a change in os_autoinst. select_console is a function in testapi.pm.
The way to switch between consoles is:
select_console 'install-shell';
select_console 'installation';
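
As a rough illustration of how the two calls above would bracket log collection, here is a minimal sketch. The helper name collect_logs_from_install_shell and the archive path are hypothetical, and select_console and script_run are replaced by simple stand-ins so the snippet runs outside os-autoinst; in a real test module they would come from testapi.

```perl
use strict;
use warnings;

# Stand-ins for the testapi functions, so the sketch runs outside os-autoinst.
# In a real test module these come from `use testapi;`.
my @calls;
sub select_console { push @calls, "select_console $_[0]"; return }
sub script_run     { push @calls, "script_run $_[0]";     return 0 }

# Hypothetical helper: leave the installer, gather YaST logs, go back.
sub collect_logs_from_install_shell {
    my ($archive) = @_;
    select_console('install-shell');
    my $ret = script_run("save_y2logs $archive");
    select_console('installation');
    return $ret;
}

collect_logs_from_install_shell('/tmp/y2logs.tar.bz2');
print "$_\n" for @calls;
```

The point of the sketch is only the ordering: the installer console is reselected after the shell work, so the following test steps continue where they left off.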

Possible solutions:
We can enable ssh on the machine with installation and connect to it to collect logs.
Alternative would be to collect logs on the machine where installation is running.

In theory it should be possible to switch tty over VNC, but it will be tricky in the VM to which we are already connected over VNC.

Attachment: Screenshot from 2019-10-11 10-29-29.png (88.1 KB), "have we tried F8 key?", JERiveraMoya, 2019-10-11 08:35

Related issues

Blocks openQA Tests - action #49622: [functional][y] Verify the wrong desktop will show up (Resolved, 2018-12-12, 2020-04-21)

History

#1 Updated by riafarov over 2 years ago

  • Copied from coordination #49889: [functional][epic][y] Switch between installation/install shell in specific scenarios (hyperv, ssh,vnc) added

#2 Updated by riafarov over 2 years ago

  • Copied from deleted (coordination #49889: [functional][epic][y] Switch between installation/install shell in specific scenarios (hyperv, ssh,vnc))

#3 Updated by riafarov over 2 years ago

  • Parent task set to #49889

#4 Updated by riafarov over 2 years ago

  • Description updated (diff)

#5 Updated by riafarov over 2 years ago

  • Description updated (diff)
  • Estimated time set to 8.00 h

#6 Updated by riafarov over 2 years ago

  • Status changed from New to Workable

#7 Updated by riafarov over 2 years ago

  • Due date changed from 2019-05-07 to 2019-05-21

#8 Updated by riafarov over 2 years ago

  • Due date changed from 2019-05-21 to 2019-06-04
  • Target version changed from Milestone 24 to Milestone 25

#9 Updated by riafarov over 2 years ago

  • Blocks action #49622: [functional][y] Verify the wrong desktop will show up added

#10 Updated by ybonatakis over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to ybonatakis

#11 Updated by riafarov over 2 years ago

  • Due date changed from 2019-06-04 to 2019-06-18

#12 Updated by ybonatakis over 2 years ago

  • Due date changed from 2019-06-18 to 2019-06-04

Most of the approaches and tests didn't work. What looks promising is a solution I found on Stack Overflow: in a nutshell, you go to the qemu console and use sendkey, or maybe even chvt.

I tested this in a local qemu image and it seemed to work.

When I tried the same in the test connected over VNC, I could switch to the qemu console but could not send any command. The problem could be in the parameters that qemu is started with. These were the parameters for the test I was using:

/usr/bin/qemu-system-x86_64 -vga cirrus -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -soundhw ac97 -global isa-fdc.driveA= -m 1024 -cpu qemu64 -netdev tap,id=qanet0,ifname=tap1,script=no,downscript=no -device virtio-net,netdev=qanet0,mac=52:54:00:12:00:17 -boot order=c,menu=on,splash-time=5000 -device usb-ehci -device usb-tablet -smp 1 -enable-kvm -no-shutdown -vnc :92,share=force-shared -device virtio-serial -chardev socket,path=virtio_console,server,nowait,id=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/2/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on -device virtio-blk,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/var/lib/openqa/pool/2/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0
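
For reference, the monitor command sendkey ctrl-alt-f2 has a QMP counterpart, send-key, which could in principle be sent to a QMP socket such as the qmp_socket in the command line above. The sketch below only builds the JSON request; the helper name qmp_send_key is hypothetical. It assumes the capabilities handshake ({"execute":"qmp_capabilities"}) has already been done on the connection, and it ignores that in an openQA job the socket is already in use by the backend.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: build a QMP send-key request for a key combination.
# Each key is passed as a QEMU qcode name, e.g. ctrl, alt, f2.
sub qmp_send_key {
    my (@keys) = @_;
    return {
        execute   => 'send-key',
        arguments => {keys => [map { +{type => 'qcode', data => $_} } @keys]},
    };
}

# Request that would switch the guest to tty2.
print JSON::PP->new->canonical->encode(qmp_send_key(qw(ctrl alt f2))), "\n";
```

Whether the guest reacts depends on what is in the foreground; an installer running on a graphical VNC session may not pass the combination on.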

#13 Updated by riafarov over 2 years ago

  • Due date changed from 2019-06-04 to 2019-06-18

#14 Updated by ybonatakis over 2 years ago

The solution so far involves opening an xterm and collecting the logs from the target (where the installation is taking place). The problem is that assert_script_run redirects its result to the serial device of the machine where the command is executed, which causes assert_script_run to fail with a timeout. One solution is to use script_run or, as suggested, type_string in those cases.
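
To make the timeout easier to picture, here is a simplified sketch of the marker mechanism as I understand it: the command is typed together with an echo of a marker and the exit status into the serial device, and the harness then waits for that marker on the serial log it is watching. The helper names, the marker MARK42 and the device ttyS0 are hypothetical, and the real implementation in os-autoinst differs in details.

```perl
use strict;
use warnings;

# Hypothetical helper: compose roughly what ends up being typed into the shell.
sub command_with_serial_marker {
    my ($cmd, $marker, $serialdev) = @_;
    return "$cmd; echo ${marker}-\$?- > /dev/$serialdev";
}

# Pattern the harness would wait for on the serial log it is watching.
sub marker_pattern {
    my ($marker) = @_;
    return qr/\Q$marker\E-(\d+)-/;
}

print command_with_serial_marker('save_y2logs /tmp/y2logs.tar.bz2', 'MARK42', 'ttyS0'), "\n";

# The marker lands on the serial device of the machine running the command
# (the target), while the worker watches the controller's serial log.
my $target_serial     = "MARK42-0-\n";
my $controller_serial = "";
print "marker seen on target serial\n"         if $target_serial =~ marker_pattern('MARK42');
print "marker missing on controller serial\n" unless $controller_serial =~ marker_pattern('MARK42');
```

With type_string nothing waits for a marker, which is why it avoids the timeout, at the cost of not knowing whether the command succeeded.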

#16 Updated by ybonatakis over 2 years ago

The approach is not unified for the VNC and ssh cases.
As suggested, I tried to use root-ssh. The problem I encountered shows up in the logs as follows:

[2019-06-17T15:21:04.785 CEST] [debug] <<< testapi::select_console(testapi_console='root-ssh')
console root-ssh does not exist at /usr/lib/os-autoinst/backend/driver.pm line 95.
[2019-06-17T15:21:04.785 CEST] [debug] Backend process died, backend errors are reported below in the following lines:
Can't call method "select" on an undefined value at /usr/lib/os-autoinst/backend/baseclass.pm line 556.

Taking a deeper look, I discovered that root-ssh is not used on the qemu backend.
In lib/susedistribution.pm:

if (get_var('BACKEND', '') =~ /ikvm|ipmi|spvm/)
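
A small sketch of what this condition means in practice, assuming it is what guards the registration of the root-ssh console. The pattern is copied from the line above; the helper name registers_root_ssh is hypothetical.

```perl
use strict;
use warnings;

# Pattern quoted from lib/susedistribution.pm above.
my $ssh_backends = qr/ikvm|ipmi|spvm/;

# Hypothetical helper: would root-ssh be registered for this backend?
sub registers_root_ssh {
    my ($backend) = @_;
    return ($backend // '') =~ $ssh_backends ? 1 : 0;
}

for my $backend (qw(ipmi spvm qemu)) {
    printf "%-5s %s\n", $backend,
      registers_root_ssh($backend) ? 'root-ssh registered' : 'no root-ssh console';
}
```

Under that assumption, a job with BACKEND=qemu never gets a root-ssh console, which matches the "console root-ssh does not exist" error in the log.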

Then I tried adding qemu to the list of patterns. The following restriction appears:

Could not retrieve required variable SUT_IP at /var/lib/openqa/share/tests/sle/lib/susedistribution.pm line 435.

I then tried to set the variable with set_var in different places in the test, without success; the same error remained.

#17 Updated by riafarov over 2 years ago

  • Due date changed from 2019-06-18 to 2019-07-02

#18 Updated by riafarov over 2 years ago

  • Target version changed from Milestone 25 to Milestone 26

#19 Updated by ybonatakis over 2 years ago

For future reference, it is also possible to collect logs from the system using start_shell in the boot options. This gives a console which can be used even when openQA crashes.

#20 Updated by ybonatakis over 2 years ago

  • Status changed from In Progress to Feedback

#21 Updated by riafarov over 2 years ago

  • Due date changed from 2019-07-02 to 2019-07-16

PR is still open.

#22 Updated by riafarov over 2 years ago

  • Due date changed from 2019-07-16 to 2019-07-30
  • Estimated time deleted (8.00 h)

A verification run (VR) is still missing to verify the changes in the code.

#23 Updated by riafarov over 2 years ago

  • Due date changed from 2019-07-30 to 2019-08-13
  • Target version changed from Milestone 26 to Milestone 27

Needs another round of review.

#24 Updated by ybonatakis over 2 years ago

Waiting for a VR in production to close it.

#25 Updated by ybonatakis over 2 years ago

The pull request has been merged but it does not work [0], at least not for ssh. I have triggered the vnc job [1] to get the full picture.

[0] https://openqa.suse.de/tests/3237658#step/welcome/18
[1] https://openqa.suse.de/tests/3238021#

#26 Updated by ybonatakis over 2 years ago

  • Assignee deleted (ybonatakis)

#27 Updated by ybonatakis over 2 years ago

vnc is also failing to upload the logs: https://openqa.suse.de/tests/3238021#

#28 Updated by riafarov over 2 years ago

  • Due date changed from 2019-08-13 to 2019-09-10
  • Status changed from Feedback to Workable
  • Target version changed from Milestone 27 to Milestone 30+

We basically need to start over here, and this time I would suggest fixing this in the console handling instead of hacking a single scenario.

#29 Updated by riafarov over 2 years ago

  • Estimated time set to 8.00 h

https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/susedistribution.pm#L651 is the piece of logic where we need to redirect root-console to use an ssh connection instead for vnc remote installations.

#30 Updated by ybonatakis over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to ybonatakis

#31 Updated by ybonatakis about 2 years ago

  • Status changed from In Progress to Workable

#32 Updated by ybonatakis about 2 years ago

  • Status changed from Workable to In Progress

#33 Updated by riafarov about 2 years ago

  • Due date changed from 2019-09-10 to 2019-09-24

#34 Updated by ybonatakis about 2 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (ybonatakis)

#35 Updated by ybonatakis about 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to ybonatakis

#36 Updated by JERiveraMoya about 2 years ago

  • Due date changed from 2019-09-24 to 2019-10-08

We failed to look at this ticket. Moving to next sprint.

#37 Updated by ybonatakis about 2 years ago

As I haven't found a solution yet that meets the requirement, I submitted a PR which should at least fix the latest failure to upload the logs when a module fails: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8610

#38 Updated by riafarov about 2 years ago

  • Due date changed from 2019-10-08 to 2019-10-22

#39 Updated by ybonatakis about 2 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (ybonatakis)

The VR in OSD doesn't work. The reason seems to be that the VM can't communicate with the server set by autoinst_url("/uploadlog/$upname"). I don't know why this works locally. I will close the PR.

#41 Updated by JERiveraMoya about 2 years ago

  • Status changed from Workable to In Progress

We cannot reuse the solution for ipmi. In ipmi there are two machines and one worker, and this worker has access to a physical serial device named SOL. In the Multi-Machine setup we have two machines and one worker for each machine, and the worker of one of them (the controller) cannot have exclusive access to the serial device (which is virtual, provided by qemu) of the other machine (the target). Even if we could connect, it would be a mess for other running jobs, as we would be hacking the serial of the target machine. Having discarded the ipmi approach, I'm looking into running commands with ssh prepended, which is also used to some extent in this PR, but we might need a simpler approach for our scenario.
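
A minimal sketch of what prepending ssh could look like on the controller side. The helper name over_ssh, the host name target.example and the ssh option shown are assumptions for illustration, not what the PR does.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a command so that it runs on the target over ssh.
sub over_ssh {
    my ($host, $cmd, %args) = @_;
    my $user = $args{user} // 'root';
    # Single-quote the remote command; embedded single quotes become '\''.
    (my $quoted = $cmd) =~ s/'/'\\''/g;
    return "ssh -o StrictHostKeyChecking=no $user\@$host '$quoted'";
}

# The resulting string would be typed or run on the controller.
print over_ssh('target.example', 'save_y2logs /tmp/y2logs.tar.bz2'), "\n";
```

Because the command line is executed by the controller's shell, its exit status and any serial marker stay on the controller, which sidesteps the need to reach the target's serial device.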

#42 Updated by okurz about 2 years ago

JERiveraMoya wrote:

We cannot reuse the solution for ipmi because in ipmi there are two machines and one worker, and this worker has access to a physical serial device named SOL, but in the Multi-Machine setup we have two machines and one worker for each machine

There is no direct necessity to handle any IPMI machines differently from qemu machines. In principle the idea is that every SUT has at least a graphics-capable console as well as a serial console.

I'm taking a look for running commands pre-pending ssh, which in some way is used also in this PR

That should only be necessary if you have remote SUTs that are not directly available, e.g. cloud VMs or in case of https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8099 it was a SUT within a separated IBM network that could only be reached over VPN and a two-jump ssh tunnel.

#43 Updated by ybonatakis about 2 years ago

From my investigation I didn't see any solution on ipmi or other systems. When they fail, the log gathering does not work.

#44 Updated by JERiveraMoya about 2 years ago

  • Assignee set to JERiveraMoya

#45 Updated by JERiveraMoya about 2 years ago

  • Status changed from In Progress to Feedback

#46 Updated by JERiveraMoya about 2 years ago

  • Status changed from Feedback to Resolved

VR: for instance, in a recent failure the controller fails on the beta warning needle and we still get the logs in the target.

#47 Updated by riafarov about 2 years ago

https://openqa.suse.de/tests/3501119# Works like a charm. Good job here!

#48 Updated by riafarov about 1 year ago

  • Due date deleted (2019-10-22)
