action #32968
closed action #30649: [tools][openqa] Improve performance by using migrations and external snapshots
[kernel][tools] Refactor QEMU backend - Create QEMU process manager and save configuration state
100%
Description
Start moving the configuration of QEMU to a more abstract model where the parameters are generated from an object model. This should allow parameters to be added and removed between QEMU restarts, as well as making the configuration more modular. There are too many parameters to create an object model for in a single refactoring (without breaking the small batch sizes principle), so we can split them into static parameters, which are just an array of strings as in the current model, and dynamic parameters, which are stored as Perl objects and are serialised into parameter strings when required. The ultimate goal is an object model which completely decouples configuration from how the parameters are passed to QEMU. After that we could possibly generalise the object model further to allow some configuration options to be shared between backends. However, it may not be necessary to go that far.
This ticket is just for creating the manager class with the static parameters.
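The split between static and dynamic parameters can be illustrated with a minimal sketch. It is written in Python purely for illustration (os-autoinst itself is Perl), and the names `QemuConfig`, `DriveDevice`, `add_static` and `command_line` are hypothetical, not taken from the actual code. Static parameters stay a plain list of strings, while each dynamic parameter is an object that is serialised only when the command line is built.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DriveDevice:
    """Hypothetical dynamic parameter: one block device kept as an object."""
    drive_id: str
    path: str
    fmt: str = "qcow2"

    def to_params(self) -> List[str]:
        # Serialised into QEMU arguments only when they are actually needed.
        return ["-drive",
                f"id={self.drive_id},file={self.path},format={self.fmt},if=none"]


@dataclass
class QemuConfig:
    """Hypothetical manager holding both kinds of parameters."""
    static_params: List[str] = field(default_factory=list)
    drives: List[DriveDevice] = field(default_factory=list)

    def add_static(self, *params: str) -> "QemuConfig":
        # Static parameters are stored verbatim, as in the old model.
        self.static_params.extend(params)
        return self

    def command_line(self, binary: str = "qemu-system-x86_64") -> List[str]:
        args = [binary, *self.static_params]
        for drive in self.drives:
            args.extend(drive.to_params())
        return args


config = QemuConfig().add_static("-m", "1024", "-smp", "2")
config.drives.append(DriveDevice("hd0", "disk0.qcow2"))
first_run = config.command_line()

# Between restarts the object model can change; the static strings do not.
config.drives.append(DriveDevice("hd0-overlay1", "disk0-snap1.qcow2"))
second_run = config.command_line()
```

Because the drives live in the object model, a device added between two QEMU starts simply shows up in the next generated command line without any string manipulation of the existing arguments.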
Updated by EDiGiacinto almost 7 years ago
- Related to action #29419: [tools] MULTINET parameter cause incomplete job added
Updated by coolo almost 7 years ago
- Target version changed from Current Sprint to 448
Updated by rpalethorpe almost 7 years ago
- Status changed from New to Workable
I have already created a class which stores the static parameters and have started creating an object model for the block storage parameters. However, that could take up the rest of the sprint, so I will finish some jobs on the kernel backlog first.
Updated by rpalethorpe over 6 years ago
- Status changed from Workable to In Progress
Updated by rpalethorpe over 6 years ago
I have the new block device object model working under various scenarios, including multipath. I have implemented saving external snapshots, but still need to implement loading them, which requires restarting QEMU.
Updated by sebchlad over 6 years ago
- Related to action #32593: Multiple ttySx consoles for qemu added
Updated by jlausuch over 6 years ago
@rpalethorpe do you think that this manager class will help to be more flexible in multi-nic scenarios? See this comment for needed specs: https://progress.opensuse.org/issues/32959#note-3
Updated by rpalethorpe over 6 years ago
If NICs, switches, routers etc. need to be added and removed during testing then yes, definitely. If not, then it will still be useful for better code organisation, but so far I have not touched networking. It doesn't appear that network devices can be added or removed during a test, or that the VLAN needs to be restarted with QEMU, so there is no state to revert during a snapshot. The current task is to move devices to the object model which have state (i.e. block devices which are added during a snapshot) or are closely related to devices with state.
Updated by jlausuch over 6 years ago
rpalethorpe wrote:
If NICs, switches, routers etc. need to be added and removed during testing then yes, definitely. If not, then it will still be useful for better code organisation, but so far I have not touched networking. It doesn't appear that network devices can be added or removed during a test, or that the VLAN needs to be restarted with QEMU, so there is no state to revert during a snapshot. The current task is to move devices to the object model which have state (i.e. block devices which are added during a snapshot) or are closely related to devices with state.
OK, thanks for the clarification. At first glance, at least, it sounds like a solution. For the case I mentioned, we would need the network devices set up before running the test, not during execution, although adding them during execution would be a nice feature for fail-over scenarios (maybe in the future).
Updated by rpalethorpe over 6 years ago
Status update: it can now restart QEMU and load an external snapshot: http://rpws.suse.cz/tests/170#
I'm surprised the code which rolls back the consoles after a snapshot has ever worked. It seems that if you activate console A then B, take a snapshot, then activate A, then load the snapshot, console A will still be selected even though console B was active at the time of the snapshot. In fact, even if you activate console A, take a snapshot, then activate console B, then revert to the snapshot, console B will be selected, although it will get 'reset'. However, console A should be active. Probably this has not been noticeable because tests haven't mixed consoles much, but it is becoming more common.
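The expected behaviour can be sketched as follows. This is an illustration in Python, not the os-autoinst implementation, and the `ConsoleTracker` class and its methods are hypothetical. The idea is simply that the selected console is recorded with each snapshot and restored when that snapshot is loaded.

```python
class ConsoleTracker:
    """Hypothetical tracker for which console is currently selected."""

    def __init__(self):
        self.selected = None
        self._saved = {}  # snapshot name -> console selected at save time

    def select(self, name):
        self.selected = name

    def save_snapshot(self, snapshot):
        # Remember which console was active when the snapshot was taken.
        self._saved[snapshot] = self.selected

    def load_snapshot(self, snapshot):
        # Reverting must also revert the console selection.
        self.selected = self._saved[snapshot]
        return self.selected


tracker = ConsoleTracker()
tracker.select("A")
tracker.select("B")
tracker.save_snapshot("snap1")   # B is active here
tracker.select("A")
restored = tracker.load_snapshot("snap1")
```

After loading `snap1`, console B is selected again, matching the state at the time the snapshot was taken.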
AFAICT I have fixed that problem, but there is another issue. Consoles may have state which needs to be saved and restored during a snapshot. The only instance of this I have thought of so far is the virtio_console, which should save whatever 'unread' data QEMU has output at the time of the snapshot. E.g. in the link above there are a lot of non-fatal failures after each snapshot because the test module expects '#', but this has already been consumed before the snapshot was loaded. The same problem will exist with other serial terminal consoles, but I don't think it is a problem for VNC. Note that this is not a problem with restarting QEMU; it is an existing problem in the current version of os-autoinst.
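A minimal sketch of the 'unread data' problem, again in Python for illustration only. The `SerialBuffer` class is hypothetical and much simpler than a real serial terminal console: matching a pattern consumes the output up to and including the match, so a prompt consumed after the snapshot is gone unless the buffer is saved and restored together with the snapshot.

```python
class SerialBuffer:
    """Hypothetical serial console buffer with snapshot support."""

    def __init__(self):
        self._unread = ""
        self._saved = {}  # snapshot name -> unread data at save time

    def feed(self, data):
        # Output arriving from the guest.
        self._unread += data

    def wait_for(self, pattern):
        # Consume everything up to and including the first match.
        index = self._unread.find(pattern)
        if index < 0:
            return False
        self._unread = self._unread[index + len(pattern):]
        return True

    def save_snapshot(self, name):
        self._saved[name] = self._unread

    def load_snapshot(self, name):
        self._unread = self._saved[name]


console = SerialBuffer()
console.feed("# ")
console.save_snapshot("snap1")

consumed_once = console.wait_for("#")    # prompt is consumed here
missing_prompt = console.wait_for("#")   # nothing left to match

console.load_snapshot("snap1")
prompt_after_restore = console.wait_for("#")
```

Without the restore step, the second `wait_for("#")` fails, which corresponds to the non-fatal failures described above.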
Finally, the way we manage the QEMU process has a few race conditions, so I have incorporated some of Ettorie's changes so that we don't just instantly fail when a socket has not been created yet. Also, we should probably start using the Mojo process management libraries.
I have a collection of other TODOs as well, but at least the concept is fully proven now. It appears that either the new method works with virtio GPUs or QEMU now supports it in all cases, as I was able to use snapshots with it enabled.
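The general idea of not failing instantly on a missing socket can be sketched like this. It is a Python illustration under stated assumptions, not the change that was actually incorporated: the function name `connect_with_retry`, the use of a Unix domain socket and the timeout values are all hypothetical.

```python
import socket
import time


def connect_with_retry(path, timeout=5.0, interval=0.1):
    """Connect to a Unix socket, retrying until it exists and accepts.

    Raises TimeoutError if the socket is still unavailable after `timeout`
    seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            return client
        except (FileNotFoundError, ConnectionRefusedError):
            # The process has not created the socket yet, or is not
            # listening on it yet. Close this attempt and try again.
            client.close()
            if time.monotonic() >= deadline:
                raise TimeoutError(f"socket {path} not ready after {timeout}s")
            time.sleep(interval)
```

A fresh socket object is created for each attempt because a socket whose `connect` has failed should not be reused.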
Updated by rpalethorpe over 6 years ago
I made a PR to make it more visible: https://github.com/os-autoinst/os-autoinst/pull/942
Updated by rpalethorpe over 6 years ago
Implemented snapshots for the virtio console so that the expected output is restored after a snapshot.
Updated by rpalethorpe over 6 years ago
- Status changed from In Progress to Feedback
Waiting for complaints...
Updated by rpalethorpe over 6 years ago
I have added the pflash vars to the openSUSE instance. Anton says that I should set the var inside main.pm instead. It should be quite easy to remove the vars if we decide to do that instead.
Updated by szarate over 6 years ago
Please note that we haven't deployed yet :)
Updated by rpalethorpe over 6 years ago
- Related to action #38813: Qemu backend rewrite fallout added
Updated by rpalethorpe over 6 years ago
- Status changed from Feedback to Resolved
Deployed; problems being tracked in fallout thread.