Project

General

Profile

Actions

action #45191

closed

developer mode: error message just when clicking "Cancel job"

Added by okurz about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2018-11-22
Due date:
% Done:

0%

Estimated time:

Description

So far the user gets a not so nice error message that the connection to os-autoinst has been lost when the test finishes.


Related issues 2 (0 open2 closed)

Related to openQA Project (public) - action #39227: Handle the job being finished more nicely in developer modeResolvedmkittler2018-08-06

Actions
Related to openQA Project (public) - action #57707: isotovideo fails to terminate cleanly, message "isotovideo: unable to inform websocket clients about stopping command server: Request timeout", regression from 4cd4af2bResolvedmkittler2019-10-04

Actions
Actions #1

Updated by okurz about 6 years ago

  • Copied from action #44249: developer mode: "Stop timeout" - like in the old interactive mode :) added
Actions #2

Updated by mkittler about 6 years ago

  • Related to action #39227: Handle the job being finished more nicely in developer mode added
Actions #3

Updated by mkittler about 6 years ago

  • Copied from deleted (action #44249: developer mode: "Stop timeout" - like in the old interactive mode :))
Actions #4

Updated by mkittler about 6 years ago

  • Description updated (diff)
  • Assignee set to mkittler
  • Target version set to Current Sprint
Actions #5

Updated by mkittler almost 6 years ago

Even with my previous idea this turns out to be hard to be implemented. So far I'm unable to prevent the command server from being interrupted until it informs the web socket clients.

Maybe I can also just delay showing the error message in the front-end. That would be a not nice solution but not sure whether messing with os-autoinst's IPC code is worth it.

Actions #6

Updated by mkittler almost 6 years ago

The code in isotovideo to terminate (and eventually kill) its subprocesses is only one way those subprocesses are terminated/killed. The worker-side also attempts to terminate/kill those subprocesses. Apparently we or better mudler re-implemented his own container here. That is why I had no success with extending the livetime of the command server to send some last command.

When working on that I come to the conclusion that we really need more documentation about the whole openQA architecture (not only specific packages and methods). It occupied me quite a while to reverse-engineer/understand what's happening here.

I find it also weird to have isotovideo itself taking care about its subprocess and terminating them also on the worker-side. Additionally, the use of Mojo::IOLoop::ReadWriteProcess::Session and Mojo::IOLoop::ReadWriteProcess::Container is kind of subtitle considering its impact/implications.


To solve this issue I could adjust the worker so it doesn't try to terminate/kill isotovideo. Instead it sends a command to the command server which will then informs web socket clients and stops. The worker could still attempt to terminate/kill isotovideo if stopping via the command server didn't work. That way I would not have to touch any of the session/container code and just let that be the fallback.

Actions #7

Updated by mkittler almost 6 years ago

  • Status changed from New to Feedback

Before implementing the 2nd approach (and possibly scraping it again) I'd like to have some feedback.

Actions #8

Updated by mkittler almost 6 years ago

WIP branches for 2nd approach:

Right now I'm stuck because the HTTP request from the worker to os-autoinst command server does not work yet. Maybe @kraih can help when he's no longer sick.

Actions #9

Updated by mkittler almost 6 years ago

  • Status changed from Feedback to In Progress

It should work now:

Actions #10

Updated by mkittler almost 6 years ago

  • Status changed from In Progress to Resolved

PRs are merged so the error should be gone as soon as everything is deployed.

Actions #11

Updated by okurz about 5 years ago

  • Related to action #57707: isotovideo fails to terminate cleanly, message "isotovideo: unable to inform websocket clients about stopping command server: Request timeout", regression from 4cd4af2b added
Actions

Also available in: Atom PDF