action #18016
closed
[sles][migration][s390x] find proper way of handling image creation for migration on zKVM
0%
Description
Observation
The openQA test in scenario sle-12-SP3-Server-DVD-s390x-migration_zdup_offline_sle12sp2_allpatterns_zkvm@zkvm fails in setup_zdup.
The following error can be seen in the log https://openqa.suse.de/tests/838567/file/autoinst-log.txt:
04:28:54.1894 Debug: /var/lib/openqa/cache/openqa.suse.de/tests/sle/tests/installation/setup_zdup.pm:22 called opensusebasetest::wait_boot
04:28:54.1895 20233 <<< testapi::select_console(testapi_console='x11')
/usr/lib/os-autoinst/consoles/vnc_base.pm:57:{
'password' => 'nots3cr3t',
'hostname' => '10.161.145.3',
'port' => 5901
}
04:28:56.1957 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:28:57.1970 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:28:58.1982 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:28:59.1997 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:29:00.2010 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:29:01.2022 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:29:02.2034 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
04:29:03.2047 20234 Error connecting to host : IO::Socket::INET: connect: No route to host
DIE Can't call method "blocking" on an undefined value at /usr/lib/os-autoinst/consoles/VNC.pm line 864.
Reproducible
Fails since (at least) Build 0297 (current job)
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
Files
Updated by okurz@suse.de almost 8 years ago
action #18016: [sles][migration][s390x] test failes in setup_zdup because
zkvm fails to select x11 console https://progress.opensuse.org/issues/18016
[…]
https://openqa.suse.de/tests/838567
would that require an additional reset_consoles here? isn't that the common cause of "incomplete" when it should rather be caught in the test and fail with a helpful error message?
@mgriessmeier, can you try to put in a testapi::record_info call if this happens and ensure there is no crash file written so that the test aborts with failed and not incomplete
Updated by mgriessmeier almost 8 years ago
- Status changed from New to In Progress
- Assignee set to mgriessmeier
okurz@suse.de wrote:
action #18016: [sles][migration][s390x] test failes in setup_zdup because
zkvm fails to select x11 console https://progress.opensuse.org/issues/18016
[…]
https://openqa.suse.de/tests/838567
The problem is that the image was not created on the right worker. I'm recreating it on the right worker now and will upload the new image after that.
For the future:
- make sure that every job which creates an image with PUBLISH_HDD_1 is running on WORKER_CLASS=zkvm-image, as well as the corresponding upgrade job (it also needs to have WORKER_CLASS=zkvm-image)
would that require an additional reset_consoles here? isn't that the common cause of "incomplete" when it should rather be caught in the test and fail with a helpful error message?
@mgriessmeier, can you try to put in a testapi::record_info call if this happens and ensure there is no crash file written so that the test aborts with failed and not incomplete
I'll put it on my todo list
Updated by mgriessmeier almost 8 years ago
- Re-added this image, now created by the correct worker
- Changed all migration tests in test-development to use WORKER_CLASS zkvm-image
Should be working with the next build.
Updated by qmsu almost 8 years ago
I see the changes, thanks.
I will check the results of the migration tests in test-development on the next build to confirm it works.
Actually, we need to prepare more s390x HDD images for zdup_offline/online migration tests (i.e. sle12sp1+sdk, sle12sp1+ha+geo, ... sle12sp2+sdk, sle12sp2+ha+geo, etc.).
So would you please send me the parameters you used to post the job that created this sle12sp2 HDD image? Then I can generate all required images myself.
Thanks.
Updated by mgriessmeier almost 8 years ago
Hi,
So the general approach would be to clone the corresponding job (e.g. sle12sp1+sdk) from openqa.suse.de to openqa.suse.de and add the PUBLISH_HDD_1 variable.
I did it like this:
/usr/share/openqa/script/clone_job.pl --host https://openqa.suse.de --from https://openqa.suse.de $JOB_ID INSTALLONLY=1 WORKER_CLASS=zkvm-image PUBLISH_HDD_1=$HDD_IMAGE_NAME _GROUP=0
NOTES:
- Using INSTALLONLY=1 is enough for the creation of the image, no need for consoletests
- Using WORKER_CLASS=zkvm-image is mandatory because otherwise the IPs do not match (will hopefully be fixed in the future)
- Using _GROUP=0 is highly recommended, because it ensures that the creation job will not pollute any existing job group
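For creating several images in one go, the command above can be wrapped in a small dry-run helper. This is only a sketch: the function names, job IDs and image names are hypothetical placeholders, and it merely prints the commands so they can be reviewed before anything is posted.

```shell
#!/bin/sh
# Sketch only: print the clone_job.pl command for one image-creation job.
# The settings mirror the command quoted above; the job ID and image name
# are supplied by the caller.
build_clone_cmd() {
    job_id=$1
    image_name=$2
    printf '%s\n' "/usr/share/openqa/script/clone_job.pl --host https://openqa.suse.de --from https://openqa.suse.de $job_id INSTALLONLY=1 WORKER_CLASS=zkvm-image PUBLISH_HDD_1=$image_name _GROUP=0"
}

# Read "job_id image_name" pairs from stdin and print one command per pair.
print_clone_cmds() {
    while read -r job_id image_name; do
        [ -n "$job_id" ] || continue   # skip empty lines
        build_clone_cmd "$job_id" "$image_name"
    done
}

# Dry run with hypothetical job IDs and image names.
print_clone_cmds <<'EOF'
111111 sle12sp1-sdk-s390x.qcow2
222222 sle12sp2-sdk-s390x.qcow2
EOF
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines can be pasted into a shell on a host that has the openQA scripts installed.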
Updated by mgriessmeier almost 8 years ago
- Subject changed from [sles][migration][s390x] test failes in setup_zdup because zkvm fails to select x11 console to [sles][migration][s390x] find proper way of handling image creation for migration on zKVM
- Category changed from Bugs in existing tests to Infrastructure
Changed the subject, since the original failure was caused by this and we should track it.
For now, all the zKVM guests use static IP addresses; that's why we need a dedicated worker class, to ensure that the created image can be booted correctly.
This is bad in several ways:
- we can only run one migration at a time
- we need to ensure that the image is always created on the correct worker
Suggestion:
Use a proper DHCP setup on s390pb to avoid this issue
=> I already created a ticket to infra@suse.de for this:
https://infra.nue.suse.com/Ticket/Display.html?id=66714
Let's use this ticket for tracking this
Updated by mgriessmeier almost 8 years ago
- Blocks action #13216: [sles][functional][s390x] Run extratest on s390x added
Updated by mgriessmeier over 7 years ago
- Assignee deleted (mgriessmeier)
Not working on this right now. Image creation works fine if the image gets created on the right worker, which everyone is handling pretty well at the moment.
Unassigning for now; feel free to ask if you plan to work on this.
Updated by okurz over 7 years ago
- Status changed from In Progress to Feedback
- Assignee set to mgriessmeier
I put a friendly bump in that infra ticket.
@mgriessmeier I assume you have no problem tracking that ticket in "feedback" status now, as you also know gschlotter personally.
Updated by okurz over 7 years ago
- Status changed from Feedback to In Progress
- Assignee changed from mgriessmeier to okurz
Updated by okurz over 7 years ago
https://openqa.suse.de/tests/1081373#step/patch_before_migration/54 running good so far on o.s.d on a different worker, but this costs us 6 minutes of useless waiting :/
Updated by okurz over 7 years ago
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3356 for the enhancement, merged, synced to osd.
verification job on osd triggered: https://openqa.suse.de/tests/1081424#live, waiting
EDIT: Failed because the repo http://openqa.suse.de/assets/repo/SLE-12-SP3-SERVER-POOL-s390x-Build0473-Media1/ could not be found; it seems to have been cleaned up already. IMHO it is too much effort to make it work for SLE12SP3 right now. Let's just assume it works and continue, ok?
I guess the next step would be to get rid of the zkvm-images worker class in all job schedules.
Updated by okurz over 7 years ago
@mgriessmeier: Do we still need https://infra.nue.suse.com/SelfService/Display.html?id=66714 ?
Updated by mgriessmeier over 7 years ago
okurz wrote:
@mgriessmeier: Do we still need https://infra.nue.suse.com/SelfService/Display.html?id=66714 ?
nope - I commented in the ticket and suggested to close it
Updated by okurz over 7 years ago
Waiting for riafarov and me to rework the templates for sle15 first; then we can adapt this step as well.
Updated by mgriessmeier over 7 years ago
All occurrences of zkvm-image were replaced to use the machine 'zkvm';
see attached dump.diff.
See also the PR for removing the worker_class from the workers.ini:
https://gitlab.suse.de/openqa/salt-pillars-openqa/merge_requests/51
Updated by okurz over 7 years ago
- Status changed from In Progress to Resolved
MR merged. I also did not find any leftover references to zkvm-image(s) in either os-autoinst or our tests, so we should be done here.
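A check for leftover references like the one described here can be approximated with a recursive grep. This is a minimal sketch, assuming local checkouts of the repositories; the function name and the directory arguments are hypothetical and not taken from this ticket.

```shell
#!/bin/sh
# Sketch only: list remaining references to the retired worker class.
# Matches both "zkvm-image" and "zkvm-images".
# Exit status follows grep: 0 if any reference remains, 1 if none was found.
find_zkvm_image_refs() {
    grep -rnE 'zkvm-images?' "$@"
}

# Example call (hypothetical checkout paths):
#   find_zkvm_image_refs ~/src/os-autoinst ~/src/os-autoinst-distri-opensuse
```

Because the exit status is 1 when nothing matches, a cleanup could be treated as complete when `find_zkvm_image_refs` prints nothing and returns non-zero.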
Updated by pvorel almost 5 years ago
I consider multiple calls of save_svirt_pty a bug: it slows down testing, see the LTP tests on s390x (svirt backend):
https://openqa.suse.de/tests/3766791#step/boot_ltp/9
BTW: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9290