[tools] 'Proper Interactive Mode' - Making openQA God's gift to developers
Right now we have 'Interactive Mode', which is effectively an 'Interactive Needling Mode'.
This is great, but it does not meet the needs of developers who want to use openQA to reproduce bugs.
That is where this proposed 'Proper Interactive Mode' comes in.
As a developer working on a bug reported by openQA, I want to log into a SUT for bug reproduction and log gathering so that I can fix the bug.
- a 'breakpoint' can be set via the WebUI at which the test execution pauses
- the user can log into the SUT remotely when within the same network
- the user can log into the SUT remotely when not in the same network, e.g. o3
- experiment with pausing test execution (e.g. does SIGSTOP on the isotovideo process work? maybe the existing interactive mode is good enough?)
- write a guide for non-openQA professionals on how to use this to fulfill the user story
- ensure SUTs within the same network can be accessed, e.g. over VNC
- extend access for workers not in the same network
- optional: extend what we use for authentication (see open questions)
user story details
openQA has found a bug, the bug has been reported, and the developer wants to reproduce it in order to fix it.
The developer wants to have access to the system under test (be it a VM, or a physical machine).
They want to be able to control the keyboard and the mouse.
They may want to be able to extract files from the system (though arguably this can be accomplished by using scp from the system under test in most cases).
To accomplish this, I believe we need a mechanism in the WebUI that lets a developer choose a test module as a 'breakpoint' for an openQA test.
When os-autoinst reaches that module, we want something similar to the 'Interactive Needling Mode', in that the test execution stops.
But unlike the Interactive Needling Mode, we want the SUT to keep on running and not be paused.
okurz suggests that something like sending SIGSTOP to isotovideo might be sufficient here.
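A minimal sketch of the SIGSTOP idea, in Python for illustration only (os-autoinst itself is written in Perl). The helper names `pause_process` and `resume_process` are hypothetical, and how the isotovideo PID would be looked up is left open. Whether stopping isotovideo really leaves the SUT running and usable is exactly what the experiment would have to show.

```python
import os
import signal


def pause_process(pid: int) -> None:
    """Suspend a process. SIGSTOP cannot be caught or ignored by the target."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let a previously stopped process continue where it left off."""
    os.kill(pid, signal.SIGCONT)
```

Sending these signals requires running as the same user as the target process, or as root.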
The developer would then be able to use the already visible server and port details to connect to the openQA worker instance in question and control the system under test.
This would be sufficient for a proof of concept/first phase of this feature, but obviously would not work in situations where the workers are not directly reachable by the developers reproducing bugs (e.g. openqa.opensuse.org).
In this case we would need the webUI server to be able to act as a proxy, and ideally offer the user the VNC session in a nice web interface, something like https://github.com/kanaka/noVNC. But I consider this 'phase 2' of this feature, with the general 'breakpoints' and 'Proper Interactive Mode' needing to be in place before we worry about making it all Webby for everyone.
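To illustrate the proxy idea, here is a rough Python sketch of a plain TCP relay that a publicly reachable host could run to forward a connection to a worker's VNC port. It is only a sketch under assumptions: `start_relay` and `_pipe` are hypothetical names, there is no authentication at all, and noVNC itself talks WebSockets, so a real setup would additionally need a WebSocket bridge such as websockify in front of it.

```python
import asyncio


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_relay(listen_host: str, listen_port: int,
                      target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Listen locally and forward every client connection to the target."""

    async def handle(client_reader, client_writer):
        try:
            target_reader, target_writer = await asyncio.open_connection(
                target_host, target_port)
        except OSError:
            # Target (e.g. the worker's VNC port) is not reachable.
            client_writer.close()
            return
        # Relay both directions until either side closes.
        await asyncio.gather(
            _pipe(client_reader, target_writer),
            _pipe(target_reader, client_writer),
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

A caller would await `start_relay(...)` inside a running event loop and keep the returned server object alive for as long as the interactive session lasts.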
After the interactive session ends, the job should probably be aborted, ideally with a proper state, e.g. user_cancelled. For a start it is ok to let the job simply fail for any reason, e.g. because it times out by itself. In most cases the result of the specific test run used for investigation via remote control is not interesting to us, and the system has been tampered with anyway.
- who (e.g. which user group) should be allowed to access the machine?
- okurz: maybe start with only "Operator" and "Admin" even though it conflicts with the user story mentioning the "developer"
- is someone at Fedora interested? See #12130#note-1
#3 Updated by okurz over 4 years ago
SUSE Studio is already offering something in this direction. We can take their approach as inspiration (but keep in mind that the source code is closed).
From a technical perspective, connecting to a virtual machine and continuing execution works by using the qemu machine commands, so it would be beneficial to first gather what we have in better documentation and then move towards a more streamlined GUI approach.
#5 Updated by okurz over 4 years ago
It's high because rbrown is stoked about it :-) But he still set it for the "future" milestone, i.e. it's not planned to be done right now. That's kind of conflicting, but as multiple people say "this is already possible right now" and we don't have a good guide to point bug investigators to, I would also see this as high priority.
#7 Updated by dheidler over 4 years ago
The current interactive mode will stop the VM but not qemu (so new VNC clients can connect, but can't do anything).
Most workers allow VNC connections (without auth, but with the -shared option).
Workers with the name openqaworkerX (where X ∈ ℕ) don't accept the connection for some reason.
#8 Updated by okurz about 4 years ago
Based on my recent experiences I wrote down https://progress.opensuse.org/projects/openqatests/wiki#Interactive-investigation which might be useful to keep in mind when working on this issue.
#9 Updated by oholecek almost 4 years ago
- Assignee set to oholecek
Working on this. The idea is that the test does not call testapi directly; instead it calls a testapi wrapper which talks (through a socket) to testapi::server, which does the actual changes. This way, when the test is waiting, I can unpause the VM and provide an interface with the available testapi calls. These can not only change the VM, but also be recorded to create a proper openQA test.
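The wrapper/server split could look roughly like the following. This is a Python illustration only, not the Perl implementation: the class names `TestApiWrapper` and `TestApiServer`, the one-JSON-object-per-line protocol, and the `backend` dictionary of callables are all assumptions made for the sketch.

```python
import json
import socket


class TestApiServer:
    """Receives calls over a socket and performs them via backend callables."""

    def __init__(self, sock: socket.socket, backend: dict):
        self.sock = sock
        self.backend = backend  # maps call name -> callable (hypothetical)

    def serve(self) -> None:
        reader = self.sock.makefile("r")
        writer = self.sock.makefile("w")
        for line in reader:  # one JSON request per line, until the peer closes
            request = json.loads(line)
            result = self.backend[request["call"]](*request["args"])
            writer.write(json.dumps({"result": result}) + "\n")
            writer.flush()


class TestApiWrapper:
    """What a test would call instead of testapi; forwards and records calls."""

    def __init__(self, sock: socket.socket):
        self.reader = sock.makefile("r")
        self.writer = sock.makefile("w")
        self.recorded = []  # (name, args) pairs, could later become a test

    def call(self, name: str, *args):
        self.recorded.append((name, args))
        self.writer.write(json.dumps({"call": name, "args": args}) + "\n")
        self.writer.flush()
        return json.loads(self.reader.readline())["result"]
```

Because every call passes through the wrapper, the same channel could serve both the scripted test and an interactive user, and the `recorded` list is where the "record to create a proper test" idea would hook in.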
Regarding security, I'm tempted to allow this only when the controlling openQA webUI is running on localhost or on config-allowed FQDNs, i.e. a test developer working on a machine she has access to.
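As a sketch of that restriction, a check along these lines could gate the feature. The function name `interactive_allowed` and the `allowed_fqdns` parameter are hypothetical, and where such a list would live in the openQA configuration is not decided here.

```python
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def interactive_allowed(webui_host: str, allowed_fqdns=()) -> bool:
    """Allow only a local webUI or one explicitly listed in the config."""
    host = webui_host.strip().lower().rstrip(".")
    allowed = {name.strip().lower().rstrip(".") for name in allowed_fqdns}
    return host in LOCAL_HOSTS or host in allowed
```

The comparison is an exact match on the normalized host name, so subdomains of an allowed FQDN are deliberately not accepted.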
#11 Updated by okurz almost 4 years ago
- Priority changed from High to Normal
oholecek, are you really working on this? I guess it's time again to deprioritize a bit. I am pretty sure we can nearly never really solve this, but we should keep this user epic around and link tickets with more specific ideas to it.