action #36442

action #14818: [EPIC] Interactive mode is an usability disaster

Access to running SUTs for System Developers

Added by szarate over 1 year ago. Updated 4 months ago.

Status: Resolved
Start date: 23/05/2018
Priority: Normal
Due date:
Assignee: mkittler
% Done: 100%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

As a System Developer I want openQA to pause a job before a specific test module is executed
so that I can go step by step through the test module and manually collect logs when I can reproduce the error.

  • AC1: System Developer is able to specify which module the job has to pause at.
    • AC1.1: A variable can be used to define a pause / developer entry point
    • AC1.2: Scheduled and running jobs offer the possibility to define a pause / developer entry point
    • AC1.3: Jobs in assigned state do not allow the possibility to define a pause / developer entry point
  • AC2: System Developer is able to interact with the SUT.
    • AC2.1: System Developer is able to connect to the SUT via vnc.
    • AC2.2: A message is shown in the webUI for a SUT that is waiting for a developer to connect with the instructions
  • AC3: System Developer is able to share information between the SUT and another machine.
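A minimal sketch of how AC1.1 to AC1.3 could be read, not the openQA implementation. The variable name `PAUSE_AT`, the function names and the settings dictionary are assumptions made for illustration; only the job states "scheduled", "running" and "assigned" come from the acceptance criteria above.

```python
# Job states in which a pause point may be defined (AC1.2).
# "assigned" is deliberately excluded (AC1.3).
PAUSE_CONFIGURABLE_STATES = {"scheduled", "running"}


def can_set_pause_point(job_state: str) -> bool:
    """Return True if a job in this state may define a pause point."""
    return job_state in PAUSE_CONFIGURABLE_STATES


def should_pause_before(module_name: str, settings: dict) -> bool:
    """Return True if the job should pause before running module_name.

    PAUSE_AT is a hypothetical variable name standing in for the
    "pause / developer entry point" variable from AC1.1.
    """
    pause_at = settings.get("PAUSE_AT")
    return pause_at is not None and pause_at == module_name
```

With these helpers, a job whose settings contain `{"PAUSE_AT": "installation"}` would pause before a module named `installation` and run every other module normally.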

Subtasks

action #36454: Move 'Minimal developer mode' to openQA web UI Resolved mkittler

action #36574: Display instructions how to connect when a job is in stat... Resolved szarate

action #36613: Implement WS connection from web UI to command server Resolved mkittler

action #37375: Create UI elements for developer sessions and update them... Resolved mkittler

action #38120: Make developer mode accesible to non-admins Resolved mkittler


Related issues

Related to openQA Project - action #42677: Keep virtual machines running on demand/failure (was: Don... Rejected 18/10/2018
Related to openQA Project - action #45671: Improve "developer mode" connection hint for non-qemu rem... New 03/01/2019

History

#1 Updated by szarate over 1 year ago

During the sprint planning we agreed to discuss the implementation details of this feature.

There is a PoC at https://github.com/os-autoinst/os-autoinst/pull/958 for a proxy to the os-autoinst command server, so that Lea can efficiently send openQA instructions to isotovideo via the openQA testapi. But Lea also wants to be able to interact with the SUT directly (in fact, this is what she is most interested in).

PR https://github.com/os-autoinst/openQA/pull/1632 is related to the removal of the current interactive mode.

#2 Updated by szarate over 1 year ago

The PO suggests having a WS proxy that allows communication along the chain user <==> webUI <==> command server <==> isotovideo.

So the first step is to have a WS connection from the webUI to isotovideo, with commands that fetch the currently running testapi command.
For that we need isotovideo changes, a WS proxy within the webUI, and webUI changes.
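The proxy chain can be illustrated with a toy relay, a sketch under assumptions rather than the actual design: in-memory `asyncio` queues stand in for the web socket connections, only the direction from the user towards isotovideo is shown, and the `status` command is made up for the example.

```python
import asyncio


async def relay(source: asyncio.Queue, sink: asyncio.Queue, hop: str, trace: list) -> None:
    """Forward messages from source to sink until a None sentinel arrives."""
    while True:
        message = await source.get()
        if message is None:
            # Pass the shutdown sentinel on to the next hop and stop.
            await sink.put(None)
            return
        trace.append(hop)  # record which hop handled the message
        await sink.put(message)


async def demo() -> tuple:
    # One queue per connection: user -> webUI -> command server -> isotovideo.
    user_to_webui = asyncio.Queue()
    webui_to_cmdsrv = asyncio.Queue()
    cmdsrv_to_isotovideo = asyncio.Queue()
    trace = []
    hops = [
        asyncio.create_task(relay(user_to_webui, webui_to_cmdsrv, "webUI", trace)),
        asyncio.create_task(relay(webui_to_cmdsrv, cmdsrv_to_isotovideo, "command server", trace)),
    ]
    await user_to_webui.put({"cmd": "status"})  # hypothetical command
    await user_to_webui.put(None)
    await asyncio.gather(*hops)
    received = await cmdsrv_to_isotovideo.get()
    return received, trace


result = asyncio.run(demo())
```

Each hop only forwards what it receives, which is the property that lets the webUI handle authentication while the command server stays unaware of individual users.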

#3 Updated by szarate over 1 year ago

  • Description updated (diff)

#4 Updated by szarate over 1 year ago

  • Description updated (diff)

#5 Updated by mkittler over 1 year ago

Today I played with the web socket console (https://github.com/os-autoinst/openQA/pull/1661) to set a test variable when the test is running and then pause autotest if the value of the variable matches the test name.

I noticed that this task isn't that easy. It is quite hard to understand how it works due to the multi-process architecture. My understanding of the different processes so far (please correct me if I'm wrong):

  process:              relevant Perl file(s):    what it does:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
* isotovideo            isotovideo                spawns further processes, IO-loop for passing some commands (main occupation), cleanup
    * backend           baseclass.pm and derived, spawns and handles backend (eg. qemu), can receive commands from isotovideo IO-loop
                        console.pm and derived 
        * qemu
        * videoencoder
    * autotest          autotest.pm, testapi.pm,  determines test order, runs test code and thus calls testapi functions, sends commands
                        console_proxy.pm,         to isotovideo IO-loop (via query_isotovideo)
                        basetest.pm and derived
    * command server    commands.pm               provides GET/POST HTTP routes and ws server, passes commands received via web sockets
                                                  to isotovideo IO-loop

So the variable value I'd like to change must be passed from the command server to autotest. I still couldn't figure out how to do it. The autotest process seems only to be able to send commands, but can not receive. Since it runs test code most of the time, it is not possible to handle any async events here. So I tried to make it simply reload the variables before starting a test but haven't had any success either (maybe my code is just wrong).

#6 Updated by coolo over 1 year ago

No, you can't rely on autotest. autotest is blocking - it's basically driving the execution, everything else is just reacting to it.

It's isotovideo's main loop that is relevant here - it will get a callback from autotest when the current test changes (cmd=>set_current_test). At this point it needs to stop taking commands from autotest and pretend the execution takes very long.

E.g. autotest might call cmd=>backend_type_string or cmd=>check_check - and those will just take a long time as isotovideo is no longer passing it to the backend.
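The idea can be sketched in a few lines of Python. This is a toy model, not the isotovideo code (which is Perl): the class, the `name` field and the `resume` method are assumptions, and only the command names `set_current_test` and `backend_type_string` are taken from the discussion above.

```python
from collections import deque


class MainLoop:
    """Toy stand-in for the isotovideo main loop."""

    def __init__(self, pause_at=None):
        self.pause_at = pause_at   # name of the test module to pause at
        self.paused = False
        self.deferred = deque()    # commands held back while paused
        self.backend_log = []      # commands that reached the "backend"

    def handle_autotest_command(self, command: dict) -> None:
        if self.paused:
            # From the point of view of autotest the call just takes long.
            self.deferred.append(command)
            return
        if command["cmd"] == "set_current_test":
            if command["name"] == self.pause_at:
                self.paused = True
            return
        self.backend_log.append(command["cmd"])

    def resume(self) -> None:
        """Continue execution and replay whatever was held back."""
        self.paused = False
        pending, self.deferred = self.deferred, deque()
        for command in pending:
            self.handle_autotest_command(command)
```

Because autotest blocks on every command, in practice at most one command would be waiting at a time; the queue only keeps the sketch general.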

Take your time to understand the architecture properly - and keeping your understanding documented is a good idea.

#7 Updated by szarate over 1 year ago

  • Description updated (diff)

#8 Updated by mkittler over 1 year ago

  • Status changed from New to In Progress
  • Assignee set to mkittler

PRs:

#9 Updated by mkittler over 1 year ago

  • It is now deployed on e212: http://e212.suse.de/tests/12240#live
    (Not the most recent commit, currently rebuilding the package.)
  • There are now 4 different tests related to the developer mode feature:
    • full-stack.t: The regular full stack test connects directly to os-autoinst websocket server to pause and resume a test while it is running. It tests everything via the developer console.
    • 33-developer_mode.t: The developer full stack test connects via the proxy, then pauses and resumes a test while it is running. It also tests whether the session is locked for other users. It tests mostly via the developer console.
    • 34-developer_mode-unit.t: Unit tests for developer mode related features. Tests database operations, the ws proxy and other methods of the LiveViewHandler.pm controller. The connection to os-autoinst is always mocked.
    • ui/25-developer_mode.t: Tests only the UI-layer (mostly JavaScript code). Everything else (eg. any web socket connections) is mocked.

Still TODO:

  • Cancel the job when the developer session is canceled DONE
    • Additionally, the developer session shouldn't be unregistered so the responsible developer can always be tracked. DONE
    • Losing the web socket connection to os-autoinst shouldn't cancel the developer session anymore. Let's see how to handle the reconnect to os-autoinst then. DONE
  • Likely fixing bugs and tests.
  • Check the test coverage, add more tests if needed.
  • More tests on staging instance.

#10 Updated by mkittler over 1 year ago

For the actual access we could use noVNC: https://github.com/novnc/noVNC#server-requirements

It uses VNC over web sockets, which is supported by QEMU. We could proxy that connection like the ws connection for the status. Not sure whether implementing this in Mojolicious will be efficient enough, though. It would be convenient because this way we could do the authentication as usual. Otherwise an extra access token could be generated and shared through the livehandler daemon, of course.

#11 Updated by szarate over 1 year ago

Looks fancy enough (although I'm not a fan of node-based apps :P), but from what I could read, it's just a matter of embedding the app and passing the websocket path; Mojolicious would only take care of displaying/serving assets... Or do you see another problem?

#12 Updated by mkittler over 1 year ago

Pausing is implemented and there are also some instructions for accessing via VNC. Access to all kinds of tests is not implemented, though.

I'm unassigning because I suppose working on the other developer mode use cases precedes making this one more fancy.

#13 Updated by mkittler over 1 year ago

  • Assignee deleted (mkittler)

#14 Updated by szarate over 1 year ago

  • Target version changed from Current Sprint to Ready

#15 Updated by okurz over 1 year ago

  • Related to action #42677: Keep virtual machines running on demand/failure (was: Don't just power off virtual machines upon job timeout) added

#16 Updated by mkittler about 1 year ago

  • Related to action #45671: Improve "developer mode" connection hint for non-qemu remote machines added

#17 Updated by okurz 8 months ago

  • Category changed from 132 to Feature requests

#18 Updated by okurz 7 months ago

  • Status changed from In Progress to Resolved
  • Assignee set to mkittler

back to "Workable". I see AC1 and AC2 fulfilled as well as implicitly "AC3: System Developer is able to share information between the SUT and another machine." considering that we have the basic possibility for VNC and other network access to the machines, e.g. one could even share data over ssh and such. I even suggest we are done here. Please reopen with a comment if thinking differently. Assigning to mkittler who did the main recent changes in this topic.

#19 Updated by coolo 4 months ago

  • Target version changed from Ready to Done
