action #9580
closedaction #9570: Boot to Snapshot
Boot to snapshot after upgrade and then rollback
100%
Description
A little more complicated than testing boot to snapshot - as we want to check that we can rollback
tasks
- come up with a way to match SP1/GA needles (the rollback target) within the SP2 test (the rollback source)
- @waitfor image generation jobs working again: fix ppc64le issues
Updated by RBrownSUSE about 9 years ago
- Assignee set to dmaiocchi
Assigning to Dario as he's volunteered
Updated by RBrownSUSE about 9 years ago
- Checklist item changed from to [ ] SLE, [ ] TW, [ ] Leap
- Target version changed from 154 to 162
Updated by dmaiocchi almost 9 years ago
For this task I have written 2 tests so far:
1. boot_to_snapshot
2. snapper_rollback
This way everything is more scalable, and we test 2 different things. It also lets us test boot_to_snapshot without a migration.
Updated by dmaiocchi almost 9 years ago
1) Done: boot_to_snapshot is merged, and the test runs without problems in production.
2) snapper rollback -> WIP
2a) adapt grub_test to boot into the snapshot taken before the migration if UPGRADE is selected
2b) add rollback_test to main.pm and work on the .pm itself
Updated by okurz almost 9 years ago
- Target version changed from 162 to Milestone 1
Updated by dmaiocchi almost 9 years ago
- Status changed from New to Feedback
- % Done changed from 50 to 90
PR created; now in the feedback phase.
Updated by RBrownSUSE almost 9 years ago
- Status changed from Feedback to In Progress
- % Done changed from 90 to 70
Please document the changes you want to see made to the production system in order to test this.
In the call today you said 'just add Upgrade=1' but we have lots of tests with upgrade=1 already set and they are not running this test
http://openqa.suse.de/tests/342503
I have gone back to the pull request and cannot find any mention of what you want changed there either... I'm happy to set up whatever jobs you need, but I need to be told what you need.
Updated by dmaiocchi almost 9 years ago
Hi Richard,
so in the main.pm:
if (snapper_is_applicable() && get_var("BOOT_TO_SNAPSHOT")) {
    loadtest "installation/boot_into_snapshot.pm";
    if (get_var("UPGRADE")) {
        loadtest "installation/snapper_rollback.pm";
    }
}
So to run the "snapper_rollback" test, two variables need to be set to 1, BOOT_TO_SNAPSHOT and UPGRADE, and the snapper_is_applicable function has to return 1:
sub snapper_is_applicable() {
    my $fs = get_var("FILESYSTEM", 'btrfs');
    return ($fs eq "btrfs" && get_var("HDDSIZEGB", 10) > 10);
}
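To make the scheduling condition above easier to check, here is a minimal, self-contained sketch of which modules it would load for a given set of job variables. It is only a model: `%settings`, the `get_var` stub and `scheduled_modules` are hypothetical helpers written for this illustration (in a real run `get_var` and `loadtest` come from the test API), while the variable names and module paths are the ones quoted above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the job settings; in a real run get_var()
# is provided by the test API and reads the job's variables.
my %settings;

sub get_var {
    my ($name, $default) = @_;
    return $settings{$name} // $default;
}

# Same check as quoted in the comment above.
sub snapper_is_applicable {
    my $fs = get_var('FILESYSTEM', 'btrfs');
    return ($fs eq 'btrfs' && get_var('HDDSIZEGB', 10) > 10);
}

# Hypothetical helper: returns the modules the quoted main.pm condition
# would schedule, instead of really loading them.
sub scheduled_modules {
    my @modules;
    if (snapper_is_applicable() && get_var('BOOT_TO_SNAPSHOT')) {
        push @modules, 'installation/boot_into_snapshot.pm';
        push @modules, 'installation/snapper_rollback.pm' if get_var('UPGRADE');
    }
    return @modules;
}

# Try a few variable combinations and print what would be scheduled.
for my $case (
    {BOOT_TO_SNAPSHOT => 1, UPGRADE => 1, HDDSIZEGB => 20},
    {BOOT_TO_SNAPSHOT => 1, HDDSIZEGB => 20},
    {BOOT_TO_SNAPSHOT => 1, UPGRADE => 1},
) {
    %settings = %$case;
    my @modules = scheduled_modules();
    print scalar(@modules), " module(s): @modules\n";
}
```

Note that in the last case nothing is scheduled: HDDSIZEGB defaults to 10 and the check requires a value strictly greater than 10, so a job has to set a larger disk size explicitly before UPGRADE=1 has any effect on these modules.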
Best
Updated by RBrownSUSE almost 9 years ago
So are you proposing that I create about a dozen all new test cases for all the migration scenarios to also test boot to snapshot and rollback?
Or are you suggesting I add boot to snapshot testing to all the migration scenarios?
I'm still not sure what you're expecting me to do...
Updated by dmaiocchi almost 9 years ago
Well, when I got the task I didn't know either on which releases/OS versions it would be run.
I was thinking that boot into snapshot has to be tested on all versions that perform a migration and support btrfs & snapper,
and that the snapper rollback should be tested whenever a system performs a migration with btrfs and snapper.
I'm speaking here about the real test case.
I know that it will take some additional time to run; maybe 4-5 minutes will be added to the "production" testing in openQA. But that is another situation and another problem. Honestly i don't know even all the matrix scenarios for openqa.
We can add this maybe for one migration, sp1-> SP2 at begin and see, before adding this test to whole production
Updated by RBrownSUSE almost 9 years ago
Part of the task is an opportunity for you to define answers to all of those questions :)
Honestly i don't know even all the matrix scenarios for openqa.
You have the SLE 12 PRD Document so you should know what is intended to be supported, you can see openqa.suse.de and all the source for os-autoinst-distri-opensuse so you can understand the current matrix. What more information do you require?
We can add this maybe for one migration, sp1-> SP2 at begin and see, before adding this test to whole production
Sounds like a good idea, will start that way - but I'm still worried about the suggestion you don't have a full picture..so please answer my above question too :)
Updated by dmaiocchi almost 9 years ago
Please define a testing strategy; based on that I can try to modify the test.
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/1270
Thank you in advance.
Scenarios that we consider :
a) do the rollback after migration, then continue testing the migrated system.
--> Problem: (the system before migration with the snapshot, backup system is not set)
--> Advantage: not really difficult to maintain.
b) Have a scenario after a regular migration test <- this approach takes longer, but has the added benefit of testing the migration of ga/sp1 to sp2 and fully validating the distribution, then rolling back that exact image and fully re-validating the distribution on the earlier version
[didn't understand this. could be like c) with a snapshot image]
--> Problem:
--> Benefits:
c) create an image for system with snapshots as backup. and do the distro-openqa testing on that.
d)....
--> DECISION:::
Updated by maritawerner almost 9 years ago
Hi, I will give input here once I find time.
Updated by dmaiocchi almost 9 years ago
i agree that a 'detailed test' is a nice solution.
a 'detailed test', boot to a snapshot of a migrated system and
rollback to before the migration and confirm the migrated system is a
valid sle12sp1 or sle12ga server using all the usual acceptance test
suites
i will list what is not clear for me:
test that the rollback functionality after migration works
properly, so that users can rollback to sle12sp1 or sle12ga after
upgrading to sle12sp2
a) so this mean that the boot_into_snapshot test and the rollback should be enabled after the X11 tests of the migrated system ?
If so, we would have to rerun the grub2 test, because it contains the steps to boot into the snapshot.
so this could be an example of a job of the matrix:
``
Sles12-sp1.
installation --> after that we are on Sles12-sp2
console
x11
shutdown ..
++ duplicate/recall the grub test ..> enable the boot_into_snapshot # Sles12-sp1
rollback
console # for Sles12-sp1
x11 # for sles12-sp1
obviously i chose sp1, but is only an example of the migration matrix.
b) or we do imagine that? Making images, job-groups, or?
``
thx in advance.
Updated by RBrownSUSE almost 9 years ago
a) so this mean that the boot_into_snapshot test and the rollback should be enabled after the X11 tests of the migrated system ?
Yes, or have the 'rollback test' using a disk image of an already tested migrated system
so this could be an example of a job of the matrix:
Sles12-sp1.
installation --> after that we are on Sles12-sp2
console
x11
shutdown ..
++ duplicate/recall the grub test ..> enable the boot_into_snapshot # Sles12-sp1
console # for Sles12-sp1
x11
Yes, but why are you thinking that through so detailed? the main.pm takes care of most of that for you, all we need is a test after the migration of a sle12sp2 machine to roll it back to sle12sp1 or ga and then run all the tests we'd normally run on sles
That's the task..make a test that rolls back a migrated system back to the version it had before it migrated...
b) or we do imagine that?
Do we imagine what?
Updated by dmaiocchi almost 9 years ago
This was a typo, from my side, sorry.
b) or we do imagine that? --> or how do we imagine this task?
I'm thinking it through in detail now to avoid implementation errors later.
That is also why I wrote the snapshot test assuming it would run in the middle of the job.
BTW, I tried loading duplicate tests in main.pm for a job, and this is not working,
neither with single test cases
nor with functions:
``
only an example:
unless (load_applicationstests() || load_slenkins_tests()) {
    load_rescuecd_tests();
    load_consoletests();
    load_x11tests();
    load_consoletests();
    load_x11tests();
}
``
Can you double-check that? I'm not sure.
I would say that to make an image, is the better solution, and clean, but this would implicate more images to maintain, another test for each migration and so on.
We should decide on that, because if we use a "rolled back" image, I have to code the test logic differently than in the case where we run all the tests sequentially.
Updated by RBrownSUSE almost 9 years ago
only an example:
unless (load_applicationstests() || load_slenkins_tests()) {
    load_rescuecd_tests();
    load_consoletests();
    load_x11tests();
    load_consoletests();
    load_x11tests();
}
It's never a good idea to repeat the same test modules within the same scenario. The WebUI can never handle it, so you get a very incomplete picture.
We've worked around that in the past by symlinking some tests so you effectively have two test modules with two different names but the same code. But that does not scale for this situation
But, I do not think that is a problem because:
I would say that to make an image, is the better solution, and clean, but this would implicate more images to maintain, another test for each migration and so on.
I totally agree that creating images is a better solution
Images to maintain is not a problem - openQA takes care of that. We will just have it make the image as a result of each migration test. The Gru will tidy them up.
And the typical desire of 'run as much as possible in parallel' does not apply here - you cannot rollback until migration is completed. And if the migration doesn't complete the image will not be made, so the rollback test will automatically be cancelled.
Updated by dmaiocchi almost 9 years ago
OK, thanks a lot for the feedback.
I just want to summarize to get a clear picture/test plan for myself. (It will be detailed, because I want to track the tasks/subtasks of this task.)
Admin side of openqa.suse.de:
- After the migration job is done, make an image that contains the migrated system (sles12-sp1-sp2-migration.qcow2, for example).
- Create a new job called, for example, rollback-migration-12sp2-sp1 (there will be multiple such jobs).
From my side:
Testing/code side (sles12-sp1-sp2-migration.qcow2):
- create a new vars.json file to adapt the workflow of the test (no installation is needed with sles12-sp1-sp2-migration.qcow2).
- create a qcow.
- adapt the main.pm to run the grub/boot_into_snapshot test at boot time of the qcow.
- rewrite the snapshot/snapper rollback test for this scenario.
- enable the snapper_rollback test only in this case.
- run the console and x11 tests afterwards.
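The code-side part of this plan could look roughly like the following sketch of a main.pm scheduling branch for the rollback job. Everything here is an assumption made for illustration: `ROLLBACK_AFTER_MIGRATION` is a hypothetical variable name, the `installation/` paths are guessed from the module names mentioned in this ticket, and `get_var`, `loadtest`, `load_consoletests` and `load_x11tests` are simple stubs that record what would be scheduled rather than the real helpers.

```perl
use strict;
use warnings;

my %settings;    # hypothetical stand-in for the job settings (vars.json)
my @schedule;    # records module names instead of really loading them

sub get_var {
    my ($name, $default) = @_;
    return $settings{$name} // $default;
}

sub loadtest {
    my ($module) = @_;
    push @schedule, $module;
    return;
}

# Stubs standing in for the existing main.pm helpers of the same name.
sub load_consoletests { push @schedule, '<console tests>'; return; }
sub load_x11tests     { push @schedule, '<x11 tests>';     return; }

sub load_rollback_schedule {
    # ROLLBACK_AFTER_MIGRATION is a hypothetical variable for this sketch.
    return unless get_var('ROLLBACK_AFTER_MIGRATION');

    # No installation modules here: the job is assumed to boot the qcow2
    # image produced by the finished migration job.
    loadtest('installation/grub_test_snapshot.pm');    # select the snapshot in grub
    loadtest('installation/boot_into_snapshot.pm');    # verify we booted into it
    loadtest('installation/snapper_rollback.pm');      # make the rollback permanent

    # Then validate the rolled-back system with the usual suites.
    load_consoletests();
    load_x11tests();
    return;
}

%settings = (ROLLBACK_AFTER_MIGRATION => 1);
load_rollback_schedule();
print "$_\n" for @schedule;
```

Because the rollback job is a separate scenario, each module appears only once in its schedule, which avoids loading the console and x11 modules twice within one job.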
Updated by dmaiocchi over 8 years ago
- Status changed from In Progress to Feedback
- % Done changed from 70 to 100
Updated by dmaiocchi over 8 years ago
According to the latest production results,
I see that the new needle matches only in a few cases.
Here, for example, the name sles-sp1 causes the string to be shifted to the right, so the word "update" is truncated.
https://openqa.suse.de/tests/382291/modules/grub_test_snapshot/steps/48
I'm open to suggestions on how we can fix that.
Updated by RBrownSUSE over 8 years ago
Richard's Rule #1 of Needling
"Make the Needle as small as it needs to be"
Make your needle smaller, so it works when truncated, just like we talked about last week..
Updated by okurz over 8 years ago
- Description updated (diff)
- Status changed from Feedback to In Progress
- Assignee deleted (dmaiocchi)
"boot_to_snapshot" on x86_64 works pretty well, e.g. see https://openqa.suse.de/tests/396527
issues we see on x86_64 are product issues, e.g. in sle-12-SP2-Server-DVD-x86_64-Build1537-rollback_migration_offline_sle12@64bit https://openqa.suse.de/tests/396187 jobs failing because of incomplete gnome menu but also because we try to apply SP2 needles to the GA build, e.g. in gnome_terminal test which does not work.
ppc is currently not triggered at all because the job creating the image after migration fails, e.g. see in https://openqa.suse.de/tests/overview?distri=sle&version=12-SP2&build=1537&groupid=25 that the job is user_cancelled and also in many predecessors so we can not do much there right now.
Updated by okurz over 8 years ago
- Checklist item changed from [ ] SLE, [ ] TW, [ ] Leap to [x] SLE, [ ] TW, [ ] Leap
- Priority changed from High to Low
also works on ppc64le: https://openqa.suse.de/tests/489879 albeit flaky.
considered done for SLE.
Still to be enabled for TW+Leap.
Updated by okurz over 8 years ago
- Copied to action #12964: [qe-core][functional][migration] Boot to snapshot after upgrade and then rollback added
Updated by okurz over 8 years ago
- Checklist item changed from [x] SLE, [ ] TW, [ ] Leap to [x] SLE
- Status changed from In Progress to Resolved
- Assignee set to okurz
- Priority changed from Low to High
openSUSE enablement tracked in #12964
Updated by okurz over 8 years ago
- Blocks action #13156: os-autoinst: Add support to easily switch VERSION during a test run added