action #64833

closed

[SLE][Migration][SLE15SP3] Add testcase with recommended modules from SLES 15 SP1 to SP2 on powerVM on spvm backend

Added by leli about 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
New test
Target version:
-
Start date:
2020-03-26
Due date:
% Done:

100%

Estimated time:
50.00 h
Difficulty:

Description

Requirement from Marita:

Would it be possible to have one default migration testcase from SLES 15 SP1 to SP2 on powerVM? For that we have the new spvm backend in openQA.

Actions #1

Updated by leli about 4 years ago

  • Priority changed from Normal to High
Actions #2

Updated by hjluo about 4 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 30

added test cases: offline_sles15sp1_pscc_basesys-srv-desk-dev_all_full_spvm and online_sles15sp1_pscc_basesys-srv_all_full_y_spvm

Actions #3

Updated by hjluo about 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 30 to 100

test cases added to daily migration group.

Actions #4

Updated by leli about 4 years ago

  • Status changed from Resolved to Workable

This still needs debugging; I haven't seen a verification log.

Actions #6

Updated by hjluo about 4 years ago

we need to create a new qcow2 for spvm.

Actions #7

Updated by okurz about 4 years ago

I don't think you can use qcow files on spvm unless you add functionality to deploy spvm VMs from qcow

Actions #8

Updated by hjluo about 4 years ago

  • % Done changed from 80 to 30

Hi Oliver,

Thanks for the information. What do you mean by "add functionality to deploy spvm VMs from qcow"? Is it workable to use the spvm backend to run the migration test?

Thanks.

Actions #9

Updated by leli about 4 years ago

okurz wrote:

I don't think you can use qcow files on spvm unless you add functionality to deploy spvm VMs from qcow

My understanding is that we should use a qcow2 created on spvm for the test, right?
I have created another ticket to create a SLE15SP1 qcow2 on the spvm backend: https://progress.opensuse.org/issues/65055

OK, thanks for the clarification. We'll try once we have a qcow2 that was created on spvm.

Actions #10

Updated by hjluo about 4 years ago

Lemon confirmed with Rodion that PowerVM doesn't support snapshots, so we can't create a qcow2 for spvm. We need chained jobs to implement the test, just like the z/VM test. This is the job that was meant to create a qcow2 on spvm; the case passed but no qcow2 was created. http://openqa.nue.suse.com/tests/4075194

Actions #12

Updated by hjluo about 4 years ago

Hi marita, thanks for the information.

Actions #13

Updated by hjluo about 4 years ago

  • % Done changed from 30 to 70

now added 2 test cases:
offline_sles15sp1_media_basesys-srv_all_full_spvm_peperation
offline_sles15sp1_media_basesys-srv_all_full_spvm

part1 http://149.44.176.58/tests/4108825

Actions #14

Updated by hjluo about 4 years ago

Hi Rodion,

We on the migration team are now looking for ways to run migration on the spvm backend, following the investigation by Lemon and you.

We are now using two steps to do the migration: the first step creates the hdd_image and the second step runs the migration on the same worker.

The first step works well, see https://openqa.nue.suse.com/tests/4114972

But we are stuck at the second step: after patch_sle we reboot and cannot get back to our serial console.

https://openqa.nue.suse.com/tests/4114973#step/patch_sle/29

One weird thing I found is that it uses /dev/sshserial, not /dev/hvc0.

So could you please help us check whether there are any settings we're missing or misusing?

Thanks in advance.

Huajian.Luo

Hi Huajian!

First, I would recommend using ppc64le-hmc workers, as they use the HMC management console, which allows more actions than novalink and hence will get better support. As of now we don't run any tests on spvm workers.

The problem you have mentioned is a problem with reboot. For the powerVM backend, similar to zVM, we need to reconnect the so-called management consoles after a reboot is triggered. See https://openqa.suse.de/tests/4125812# as an example.

I can also see that in sle/tests/update/patch_sle.pm we don't use the power_action method, which is the recommended way and contains additional steps for different platforms. Unfortunately, just typing reboot works reliably on the qemu backend only.

As for the serial setting, it does look strange, but in vars.json I can see that the setting is set to the correct value. I believe you are right and we should have this set to sshserial in the machine definition.

Both spvm and hmc are backends for the powerVM platform, so from the test development perspective both do the same thing, even though they use a different stack. As far as I know, we have implemented the hmc backend because it will potentially allow us to work with snapshots (as the powerVM platform supports making images and booting from them).

So in short, as of now spvm and hmc provide the same functionality, but hmc will be the one to get further improvements.

Keep in mind that this is just a recommendation and not a strict requirement.

Actions #15

Updated by hjluo about 4 years ago

Now triggered with machine=ppc64le-hmc, ids => [4138575, 4138576]

Phase 1: http://149.44.176.58/tests/4138575 PASSED
Phase 2: http://149.44.176.58/tests/4138576 FAILED at patch_sle. After reboot we can't get the console back.

Actions #16

Updated by hjluo about 4 years ago

Use power_action in patch_sle as Rodion suggested.

  • #type_string "reboot\n";
  • power_action('reboot', keepconsole => 1, textmode => 1);

http://openqa.suse.de/t4142789 and http://openqa.suse.de/t4142790

Actions #17

Updated by hjluo about 4 years ago

Now use Rodion and Gaowei's suggestions to run like this.

if (is_pvm()) {
    diag 'Called power_action reboot textmode=1 ....';
    power_action('reboot', textmode => 1);
    prepare_system_shutdown;
    reconnect_mgmt_console(timeout => 500);
}

http://openqa.nue.suse.com/t4160277
http://openqa.nue.suse.com/t4160278

It failed at http://149.44.176.58/tests/4160278#step/patch_sle/32 with:
Test died: Error connecting to root@redcurrant-7.qa.suse.de: No route to host at /usr/lib/os-autoinst/testapi.pm line 1622.

Actions #18

Updated by hjluo about 4 years ago

new run with online patch in zypper_patch.pm

http://openqa.nue.suse.com/t4182054

http://openqa.nue.suse.com/t4182055

timeout at zypper_migration

All repositories have been refreshed.
Available migrations:

1 | SUSE Linux Enterprise Server 15 SP2 ppc64le
    Basesystem Module 15 SP2 ppc64le
    Python 2 Module 15 SP2 ppc64le
    Server Applications Module 15 SP2 ppc64le

XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":45877"
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":48231"
after 28853 requests (28853 known processed) with 0 events remaining.
after 28692 requests (28692 known processed) with 0 events remaining.
[2020-04-30T12:53:50.642 CEST] [debug] backend got TERM
[2020-04-30T12:53:50.642 CEST] [debug] signalhandler got TERM
[2020-04-30T12:53:50.643 CEST] [debug] Closing SSH serial connection with redcurrant-6.qa.suse.de
[2020-04-30T12:53:50.643 CEST] [debug] terminating command server 342764 because test execution ended
[2020-04-30T12:53:50.643 CEST] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20263/OHepLh39Fq9PUoLo/broadcast
[2020-04-30T12:53:50.643 CEST] [debug] flushing frames
[2020-04-30T12:53:50.643 CEST] [debug] autotest received signal TERM, saving results of current test before exiting
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":36233"
after 28858 requests (28858 known processed) with 0 events remaining.
[2020-04-30T12:53:50.647 CEST] [debug] [autotest] process exited: 1
[2020-04-30T12:53:50.647 CEST] [debug] commands process exited: 0

Actions #19

Updated by hjluo almost 4 years ago

new run:

http://openqa.nue.suse.com/t4183118 and http://openqa.nue.suse.com/t4183119
and zypper migration complained with "mkfifo failed for /dev/sshserial"
mkfifo: cannot create fifo '/dev/sshserial': File exists

Lemon suggested adding a diag of $out to this code to check why '1' is not entered:

    if ($out =~ $zypper_migration_target) {
        my $version = get_var("VERSION");
        $version =~ s/-/ /;
        # \| matches the literal column separator in the migration menu
        if ($out =~ /(\d+)\s+\|\s+SUSE Linux Enterprise.*?$version/m) {
            send_key "$1";
        }
    }
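
Note: the menu-matching step can be tried outside openQA with the sample output from comment #18. The following is only a minimal sketch, not the code used in the test. The helper name pick_migration_target is hypothetical, and it assumes that the `|` in the pattern is meant as the literal column separator of the menu (so it is escaped) and that VERSION looks like "15-SP2".

```perl
use strict;
use warnings;

# Sample menu text, copied from the zypper migration output in comment #18.
my $out = <<'EOT';
Available migrations:

    1 | SUSE Linux Enterprise Server 15 SP2 ppc64le
        Basesystem Module 15 SP2 ppc64le
        Python 2 Module 15 SP2 ppc64le
        Server Applications Module 15 SP2 ppc64le
EOT

# Hypothetical helper (not part of os-autoinst): return the number of the
# menu entry whose line names the target version, or undef if none matches.
sub pick_migration_target {
    my ($menu, $version) = @_;
    $version =~ s/-/ /;    # "15-SP2" -> "15 SP2", as in the snippet above
    # \| is a literal pipe; \Q..\E quotes the version so it is matched as text.
    return $menu =~ /^\s*(\d+)\s+\|\s+SUSE Linux Enterprise.*?\Q$version\E/m
      ? $1
      : undef;
}

my $choice = pick_migration_target($out, '15-SP2');
print defined $choice ? "would select entry $choice\n" : "no matching entry\n";
```

With the sample above this prints "would select entry 1". If the pipe is left unescaped, the pattern becomes an alternation and can match any number followed by whitespace, so $1 may not be the menu entry.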

Actions #21

Updated by okurz almost 4 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: offline_sles15sp1_media_basesys-srv_all_full_spvm@ppc64le-hmc
https://openqa.suse.de/tests/4262704

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released"
  3. The label in the openQA scenario is removed
Actions #23

Updated by hjluo almost 4 years ago

online with fix:
openqa-clone-custom-git-refspec https://github.com/hjluo/os-autoinst-distri-opensuse/tree/ppc64le-spvm http://openqa.nue.suse.com/tests/4274294 -c " --apikey XXXXX --apisecret XXXXXX" _GROUP=0 -c " --parental-inheritance" MAX_JOB_TIME=3600000 MIGRATION_METHOD=yast
http://openqa.nue.suse.com/t4274345
http://openqa.nue.suse.com/t4274346

http://openqa.nue.suse.com/tests/4288051#step/yast2_migration/25

Actions #24

Updated by hjluo almost 4 years ago

offline with fix:
barry:~/:[0]# openqa-clone-custom-git-refspec https://github.com/hjluo/os-autoinst-distri-opensuse/tree/ppc64le-spvm http://openqa.nue.suse.com/tests/4279317 -c " --apikey XXXX --apisecret XXXX" _GROUP=0 -c " --parental-inheritance" MAX_JOB_TIME=3600000

http://openqa.nue.suse.com/t4279356
http://openqa.nue.suse.com/t4279357

Actions #25

Updated by hjluo almost 4 years ago

change power_action
power_action('reboot', observe => 1, keepconsole => 1, first_reboot => 1);
http://openqa.nue.suse.com/t4281828 and http://openqa.nue.suse.com/t4281829
http://openqa.nue.suse.com/tests/4281829#step/reboot_to_upgrade/6 works; now changing the remaining reboots the same way to try again.

http://openqa.nue.suse.com/t4287801 and http://openqa.nue.suse.com/t4287802

Actions #26

Updated by hjluo almost 4 years ago

new online zypper migration run
http://openqa.nue.suse.com/t4287804 and http://openqa.nue.suse.com/t4287805

XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":43907"
after 28811 requests (28811 known processed) with 0 events remaining.

rerun to verify:
http://openqa.nue.suse.com/t4288012 and http://openqa.nue.suse.com/t4288013

add debug output:
http://openqa.nue.suse.com/t4290962 and http://openqa.nue.suse.com/t4290963

Actions #27

Updated by hjluo almost 4 years ago

latest offline with fix:
openqa-clone-custom-git-refspec https://github.com/hjluo/os-autoinst-distri-opensuse/tree/ppc64le-spvm http://openqa.nue.suse.com/tests/4287791 -c " --apikey XXX --apisecret XXX" _GROUP=0 -c " --parental-inheritance" MAX_JOB_TIME=3600000

http://openqa.nue.suse.com/t4288142
http://openqa.nue.suse.com/t4288143

with wei.gao's suggested fix:
http://openqa.nue.suse.com/t4288989 http://openqa.nue.suse.com/t4288990
It looks like GRUB is up and we can do something there.

Actions #28

Updated by hjluo almost 4 years ago

  • Assignee changed from hjluo to tinawang123

Transferring this ticket to yutao because it is not just a ticket but a new feature that needs to be added to the sles15sp3 project.

Actions #29

Updated by tinawang123 almost 4 years ago

  • Status changed from Workable to In Progress
  • % Done changed from 70 to 40
  • Estimated time changed from 10.00 h to 50.00 h
Actions #30

Updated by tinawang123 almost 4 years ago

  • Subject changed from [SLE][Migration][SLE15SP2] Add testcase with recommended modules from SLES 15 SP1 to SP2 on powerVM on spvm backend to [SLE][Migration][SLE15SP3] Add testcase with recommended modules from SLES 15 SP1 to SP2 on powerVM on spvm backend
Actions #32

Updated by tinawang123 almost 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 40 to 100