Don't use livehandler if no developer looks at it
Next morning, next outage :(
error_log was filling up with errors trying to access the live handler port
and it's no surprise as the live handler was dead (and we have no idea what's
openqa:/home/coolo # strace -p 28474 -f
Process 28474 attached
restart_syscall(<... resuming interrupted call ...>^CProcess 28474 detached
#3 Updated by mkittler over 1 year ago
https://github.com/os-autoinst/openQA/commit/7a97302b8a42dcaedfb34fd60a04efea0b08bc7c should prevent the immediate problem when the livehandler isn't reachable.
But yes, it would be nice if the worker would only post the upload progress if someone is watching the test. I could just use the existing
has_logviewers for this.
Only problem would be the following sequence of events:
- Nobody is watching the job (e.g. the developer closed the tab).
- The job is paused due to assert_screen timeout.
- The developer opens the tab again. The upload progress hasn't been posted by the worker so the needle editor is not offered although the latest screenshot would be ready.
Not sure how to solve this in an elegant way. Actually, I wanted to keep the worker out of it as much as possible. The problem is that the worker is responsible for uploading the test artifacts and hence is the only component that knows when the latest screenshot is ready.
On the other hand, what would be the big benefit of saving that post call? It is only a small extra cost on top of uploading the artifacts. And now that should actually be true, because the worker shouldn't endlessly retry the same post in the error case.
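To illustrate the "don't retry endlessly" point, here is a minimal Python sketch of a bounded retry. It is not the code from the commit linked above (the worker itself is not written in Python); `post_upload_progress`, `send` and `max_attempts` are hypothetical names used only for illustration.

```python
from typing import Callable


def post_upload_progress(send: Callable[[], bool], max_attempts: int = 3) -> bool:
    """Call send() at most max_attempts times and report whether it succeeded.

    send is a hypothetical callable that performs the actual post and
    returns True on success. It may raise OSError (for example
    ConnectionRefusedError) when the livehandler is not reachable.
    """
    for _ in range(max_attempts):
        try:
            if send():
                return True
        except OSError:
            # Livehandler unreachable: count this as a failed attempt.
            pass
    # Give up instead of retrying forever; the caller carries on without it.
    return False
```

The point of the sketch is only that a dead livehandler costs a fixed, small number of attempts per upload rather than an unbounded retry loop.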
#4 Updated by coolo over 1 year ago
Your commit does not limit the problem well enough - because you still pile up apache workers waiting for the backend to
And I don't care too much about developers closing tabs - as soon as one developer looked at it, it's fine to
use the live handler. But what we should avoid is that the mass of jobs touches unnecessary parts.
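A rough Python sketch of that rule, assuming the worker can read something like the has_logviewers state mentioned earlier: once any developer has looked at the job, keep using the live handler, even if the tab is closed later. `ProgressGate` and its methods are hypothetical names and not part of openQA.

```python
class ProgressGate:
    """Decides whether a job should post upload progress to the live handler."""

    def __init__(self) -> None:
        # Becomes True the first time a viewer is seen and never resets.
        self.developer_seen = False

    def update(self, has_logviewers: bool) -> None:
        """Record the current viewer state (assumed to be polled by the worker)."""
        if has_logviewers:
            self.developer_seen = True

    def should_post(self) -> bool:
        """Only jobs that were looked at at least once touch the live handler."""
        return self.developer_seen
```

With such a flag, the mass of jobs nobody ever opens would never contact the live handler, while the closed-tab sequence described above would still work for any job a developer had opened at least once.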