action #40913
closed
script_output sometimes fails on virtio console
Added by cfconrad over 6 years ago.
Updated over 6 years ago.
Category:
Regressions/Crashes
Description
Observation
It seems to happen randomly in openQA that tests fail because "cat -" never finishes.
Steps to reproduce
Observations on openQA
I was able to bring my openQA instance into such a state. I'm actually not sure whether this
is the same problem as we have on osd, but it looks similar. The big difference is that once it happens,
it always happens for that worker.
What I did so far:
- Start a test that uses the virtio console
- Restart openQA while the test is running
- Run the tests again
Problem
A call like this:
cat - > /tmp/script8RI3l.sh; echo 8RI3l-$?-
does not get the EOT, so we never get back to the prompt.
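As a rough illustration of why the EOT matters here: "cat -" only stops when its input reaches end-of-file. On a terminal in canonical mode it is the line discipline that turns the EOF character (0x04 by default) into an end-of-file for the reading process; the byte itself means nothing to cat. The sketch below assumes a POSIX shell and uses a pipe instead of a console, so the 0x04 byte is copied like any other data. The variable names and the temporary file are made up for this example.

```shell
# Through a pipe there is no terminal line discipline,
# so the byte 0x04 (EOT) is treated as ordinary data.
out_file="$(mktemp)"

# Send one script line followed by a raw EOT byte.
printf 'echo hi\n\004' | cat - > "$out_file"

# cat stopped only because the pipe was closed, not because of the EOT byte,
# so the EOT byte ended up in the file together with the script line.
byte_count=$(( $(wc -c < "$out_file") ))
echo "bytes written: $byte_count"
```

If the EOT is not interpreted on the console side for whatever reason, cat has no other way of knowing that the input is complete, which would match the hang described above.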
Suggestion
This needs deeper investigation.
Assuming that EOT is handled specially by the terminal and converted into a signal at some early stage, it may be that the signal is raised before the cat process is ready to receive signals. I'm not sure if we wait long enough for the shell to start cat before sending the data. For the text content this probably doesn't matter, because it is just buffered somewhere and eventually gets delivered to cat.
Sending EOT multiple times is dangerous because it might cause the shell to exit, IIRC. We need to know when cat has started and is ready to receive data, but I don't see a way of doing that.
Such a call is hard to handle even on non-virtio consoles; it basically works only if we type everything. Can't we just replace it with some other call? To be honest, I would expect the main use case for the hyphen to be when we want input both from a file and from the keyboard. In this case it seems we could simply use echo or, if the script is too big, just download it from the worker.
@riafarov Yes, I think using a temp file and just downloading it would avoid that problem. And if someone really needs "type_command", we could go with the echo approach.
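A minimal sketch of what the echo approach could look like, assuming a POSIX shell. It uses printf rather than echo to avoid differences in escape handling, and it is not what openQA actually does: the script lines, file name and variable names are invented for illustration, and only the "8RI3l" marker style is taken from the call shown in this ticket. The point is that the whole script travels as command arguments, so nothing waits on stdin for an EOT.

```shell
# Sketch only: write the script with printf instead of "cat -",
# so the shell never waits on stdin for an EOT.
script_file="$(mktemp)"
printf '%s\n' \
    'echo "hello from script"' \
    'echo "second line"' > "$script_file"
# Same style of marker as in the ticket: suffix plus exit status of the write.
marker="8RI3l-$?-"
echo "$marker"

# Run the uploaded script and capture what it prints.
script_output_text="$(sh "$script_file")"
echo "$script_output_text"
```

Quoting gets awkward for scripts that contain single quotes, and long scripts mean a lot of typing on the console, which is where downloading the file would be the better fit.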
@rpalethorpe This seems to be the problem. I played with a delay and the problem no longer appeared.
I will remove my steps to reproduce, as they turned out not to be correct. If I just run the tests in a loop, they fail from time to time...
Introducing file download will not always work, because in some cases we need to use script_output to actually set up the network :)
- Status changed from New to Resolved
- Category set to Regressions/Crashes
- Assignee set to cfconrad
Since you accomplished that, you can assign yourself. Don't be ashamed of your contributions ;)
I thought it didn't make sense to assign someone to a closed bug. Thanks for pointing that out.
Sure, that is certainly an exception. The normal flow should be that someone picks up a ticket when starting the development, sets it to "In Progress" while assigning him-/herself, and then just sets it to "Resolved" when done.