action #39845

Results of tests with very short duration (~<10s) are not displayed

Added by cfconrad over 1 year ago. Updated 4 months ago.

Status: Resolved
Start date: 16/08/2018
Priority: Normal
Due date:
Assignee: tinita
% Done: 100%
Category: Feature requests
Target version: Current Sprint
Difficulty:
Duration:

Description

If a job takes less than approximately 10s to execute, its results are not displayed in the openQA web UI.
When the execution time is extended with "script_run('sleep 8');", the results are displayed.

I noticed this only with the ssh backend (https://github.com/os-autoinst/os-autoinst/pull/1012), which is in development.

Failed job: http://10.86.1.52/tests/36


Related issues

Related to openQA Project - action #58826: Result not rendered in detail view on short (e.g. <10s) t... New 29/10/2019

History

#1 Updated by coolo over 1 year ago

This is most likely because the worker didn't yet see that there is something running at all. Unusual problem :)

#2 Updated by coolo over 1 year ago

  • Subject changed from [tool] Results of tests with very short duration (~<10s) are not displayed to Results of tests with very short duration (~<10s) are not displayed
  • Target version set to Ready

#3 Updated by andriinikitin about 1 year ago

My non-expert investigation leans toward the conclusion that it is "by design": start_time is updated when the first message is received from the worker, and "update status" messages come every 10 seconds, as defined here https://github.com/os-autoinst/openQA/blob/64ccc82ec49796560ac09d5efa3fe8105a1655fc/lib/OpenQA/Worker/Common.pm#L66

So the quickest/simplest solution may be to somehow send a simple message to the WebService "immediately" after job start.

But I would prefer a solution where the Worker sends its own explicit start and finish times to the WebService, e.g. in the first/last message or in explicit message packets. The WebService may also record its own timestamps for the first/last messages received, which may be useful e.g. to understand eventual latency.
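
To illustrate the timing described in this comment, here is a minimal sketch in Python. It is a simplified model, not openQA code: it assumes the first status update fires one full interval (10 seconds, per the comment above) after the job starts, and that start_time is only recorded if at least one update arrives while the job is still running. The function names are hypothetical.

```python
def polls_seen_by_job(job_duration_s, poll_interval_s=10.0):
    """Return the poll times (seconds after job start) at which the job is still running.

    Assumption: the first poll happens one full interval after job start.
    """
    seen = []
    t = poll_interval_s
    while t < job_duration_s:
        seen.append(t)
        t += poll_interval_s
    return seen


def start_time_recorded(job_duration_s, poll_interval_s=10.0):
    # In this model, start_time is only set when at least one status
    # update arrives while the job is still running.
    return len(polls_seen_by_job(job_duration_s, poll_interval_s)) > 0


for duration in (2.0, 9.0, 12.0, 25.0):
    print(duration, start_time_recorded(duration))
```

In this model, any job shorter than the interval finishes before the first update, which would match the observation that adding a sleep makes the results appear.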

#4 Updated by okurz 8 months ago

  • Category changed from 132 to Feature requests

#5 Updated by tinita 6 months ago

How can this be reproduced?
How can I create a job that takes less than 10s?

And which part is not displayed in the ui?

#6 Updated by cfconrad 6 months ago

hi,

create a nonsense test like https://github.com/cfconrad/os-autoinst-distri-opensuse/blob/sandbox_clemix/tests/clemix/nop.pm

If you trigger just this, you will get a result like: http://cfconrad-vm.qa.suse.de/tests/5993
If you wait those 10 seconds first, you get: http://cfconrad-vm.qa.suse.de/tests/5994

#7 Updated by tinita 6 months ago

cfconrad wrote:

create a nonsense test like https://github.com/cfconrad/os-autoinst-distri-opensuse/blob/sandbox_clemix/tests/clemix/nop.pm

Ah, I see, I thought I always had to run the "boot_to_desktop" test first; that's why my test took longer in total.

I was able to reproduce it now with your test, thanks!

#8 Updated by tinita 6 months ago

  • Target version changed from Ready to Current Sprint

#9 Updated by tinita 5 months ago

  • Status changed from New to In Progress
  • Assignee set to tinita

#10 Updated by tinita 5 months ago

Like Andrii said, the first status call to isotovideo happens too late, so it no longer gets a response.
As discussed with Sebastian, I am working on replacing the socket-based status call with a status file that survives the end of isotovideo.
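
The status-file idea can be sketched roughly as follows in Python. This is only an illustration of the general technique, not the code from the PRs: the file name, the JSON fields, and the helper names `write_status` and `read_status` are all hypothetical. The point is that a reader arriving after the producer has finished still finds the last state on disk, whereas a socket call would get no answer.

```python
import json
import os
import tempfile


def write_status(path, status):
    """Write the status dict as JSON to `path` via a temp file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(status, handle)
        # Rename is atomic on POSIX, so readers never see a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def read_status(path):
    """Return the last written status, or None if no file exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as workdir:
    status_file = os.path.join(workdir, "status.json")  # hypothetical name
    write_status(status_file, {"running": "nop", "finished": False})
    write_status(status_file, {"running": None, "finished": True})
    # The producer is done; a late reader still sees the final state.
    print(read_status(status_file))
```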

#12 Updated by pvorel 5 months ago

  • Description updated (diff)

#14 Updated by tinita 4 months ago

Current Status:

I created the PRs which fix the issue by using a status file.
However, one of the tests (t/33-developer_mode.t) was failing sometimes.
The status call we were using before had some side effect (which seems to be a timing thing only, but I'm not sure).

I have now spent a long time debugging this (and getting to know a lot of our code in the process), but the reason is still unclear.
We can't merge my PR until this is fixed.

We don't yet have a plan for what to do about it.

#15 Updated by tinita 4 months ago

The bug I mentioned is very probably fixed (PR https://github.com/os-autoinst/os-autoinst/pull/1230 still in review).
Then I can rebase my PRs for this issue.

#16 Updated by cdywan 4 months ago

Isn't this actually Low priority? On the other hand gh#os-autoinst/os-autoinst-distri-opensuse#8329 seems to be blocked by it.

#17 Updated by pvorel 4 months ago

  • Priority changed from Low to Normal

cdywan wrote:

Isn't this actually Low priority? On the other hand gh#os-autoinst/os-autoinst-distri-opensuse#8329 seems to be blocked by it.

Yes please, we're waiting for this to be fixed. BTW, we might use gh#os-autoinst/os-autoinst-distri-opensuse#8329 to fix all of LTP on o3, which is very often broken (#51743, https://openqa.opensuse.org/tests/1064280#next_previous).

#18 Updated by tinita 4 months ago

We're on it. Sorry, it was blocked for a long time by a bug in os-autoinst that needed to be fixed first.
Second, it introduces a new way of communication between the openQA worker and isotovideo, so both repos had to be updated and we couldn't merge the second PR before the first was merged.

The second PR is in review and should be merged soon. https://github.com/os-autoinst/openQA/pull/2327

#19 Updated by pvorel 4 months ago

@tinita: thanks a lot for working on it :)

#21 Updated by tinita 4 months ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 100

#22 Updated by tinita 4 months ago

cfconrad it was deployed to https://openqa.opensuse.org/, can you test?

#23 Updated by cfconrad 4 months ago

hi @tinita, I ran it on my own instance with the latest openQA installed.
Looks good, nice!
http://cfconrad-vm.qa.suse.de/tests/6136
EDIT
http://cfconrad-vm.qa.suse.de/tests/6141 <= real test run

Regarding your hint, I took a look at the details page during the run, and I saw the attached intermediate state.
I don't know if this is something that should be covered as well.
http://imagebin.nue.suse.com/2476

Do you need a test run on openqa as well?

#24 Updated by tinita 4 months ago

@cfconrad The fact that the short tests don't show up in the intermediate state is an additional issue. While working on this issue I couldn't figure out why it's happening.
Could you open a new issue for that? Thanks!

#25 Updated by cfconrad 4 months ago

  • Related to action #58826: Result not rendered in detail view on short (e.g. <10s) test-modules, if job is still running added

#27 Updated by tinita 4 months ago

  • Status changed from Feedback to Resolved

Thanks!
