action #123864
closed
[openqa-in-openqa][sporadic] test fails in start_test due to empty response from o3 size:M
Added by okurz almost 2 years ago.
Updated almost 2 years ago.
Category:
Bugs in existing tests
Description
Observation
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install+publish@64bit-2G fails in start_test due to empty response from internal openQA instance
Test suite description
Maintainer: okurz@suse.de Test for installation of openQA itself. To be used with "openqa" distri. Publishes a qcow2 image including the openQA installation ready to run as an appliance.
Reproducible
Fails since (at least) Build :TW.16820 (current job)
https://openqa.opensuse.org/tests/3086954#comments shows that this is a sporadic issue.
Expected result
Last good: :TW.16819 (or more recent)
Suggestions
- Catch the error if no jobs are found from an API query, maybe just set -u in bash is enough to fail when using variables?
- Right now the test module looks at openQA assets to find the "most recent Tumbleweed build" and then queries jobs for that build. However, assets can very well be created and registered before all openQA jobs are scheduled, so this approach can fail. Find a better way in https://github.com/os-autoinst/os-autoinst-distri-openQA/blob/master/tests/osautoinst/start_test.pm to identify the latest job in a scenario and just clone that.
- If the above fails then consider adding a retry with waiting in between
- Ensure we're not waiting too long, e.g. avoid waiting a day and spawning more jobs in between
- The issue doesn't seem to be reproducible reliably
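The first two suggestions could look roughly like the following. This is only a sketch in Python, not the code in start_test.pm: the function name `latest_job_id`, the exception `NoCandidateJob`, and the record fields `id`, `state` and `result` are assumptions about what a jobs query returns, and it further assumes that a higher job id means a more recently created job.

```python
from typing import Any, Dict, List


class NoCandidateJob(RuntimeError):
    """Raised when a job query yields nothing that could be cloned."""


def latest_job_id(jobs: List[Dict[str, Any]], result: str = "passed") -> int:
    """Return the highest id among finished jobs with the wanted result.

    Assumption: job ids grow over time, so the highest id is the latest job.
    """
    candidates = [
        job["id"]
        for job in jobs
        if job.get("state") == "done" and job.get("result") == result
    ]
    if not candidates:
        # Fail loudly instead of continuing with an empty value.
        raise NoCandidateJob(f"no finished job with result {result!r} found")
    return max(candidates)
```

Raising on an empty candidate list plays the same role as set -u would in bash: the test stops at the query instead of trying to clone a job with an empty id.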
Further details
Always latest result in this scenario: latest
Looks like no subsequent tests ran into the issue again (and there were a lot of them). The only failure is https://openqa.opensuse.org/tests/3099784#step/start_test/13 but it ran into a timeout instead of getting an empty reply.
Unfortunately the logs don't give any obvious insight into why this API call didn't return any jobs.
- Subject changed from [openqa-in-openqa][sporadic] test fails in start_test due to empty response from internal openQA instance to [openqa-in-openqa][sporadic] test fails in start_test due to empty response from o3
… due to empty response from internal openQA instance
It is actually querying o3 here, not an internally spawned openQA instance. The command is clearly openqa-cli api --host http://openqa.opensuse.org jobs …, so I'm changing the ticket title.
Note that this test determines the latest TW build on o3 by querying assets. It then attempts to find candidate jobs of that build (matching certain criteria), and one of those jobs would be cloned. Here the build could be found but the query for candidate jobs returned no results. Not sure why that would be the case, though. Maybe the asset had already been registered (and thus showed up in the initial query) but jobs hadn't been (visibly) scheduled yet (and thus the 2nd query returned no results)? That would be possible. Presumably retrying the 2nd query would help then (considering https://openqa.opensuse.org/tests/overview?distri=opensuse&version=Tumbleweed&build=20230131&arch=x86_64 shows that jobs for this build have eventually been scheduled).
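To illustrate what retrying the 2nd query could mean, here is a minimal sketch in Python. It is not the actual change to the test module: `query_with_retry`, its parameters and the default values are made up for illustration, and `query` stands for whatever callable performs the candidate job lookup.

```python
import time
from typing import Any, Callable, Dict, List


def query_with_retry(
    query: Callable[[], List[Dict[str, Any]]],
    attempts: int = 5,
    delay_s: float = 60.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> List[Dict[str, Any]]:
    """Call query() until it returns a non-empty list or attempts run out.

    The total waiting time is bounded by (attempts - 1) * delay_s.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        jobs = query()
        if jobs:
            return jobs
        if attempt < attempts:
            # Jobs may simply not be scheduled yet, so wait and ask again.
            sleep(delay_s)
    raise RuntimeError(f"query returned no jobs after {attempts} attempts")
```

Keeping both the number of attempts and the delay small and fixed limits the worst case to a few minutes, so the test cannot end up waiting long enough for a newer build to appear in between.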
- Subject changed from [openqa-in-openqa][sporadic] test fails in start_test due to empty response from o3 to [openqa-in-openqa][sporadic] test fails in start_test due to empty response from o3 size:M
- Description updated (diff)
- Status changed from New to Workable
- Status changed from Workable to In Progress
- Assignee set to mkittler
- Due date set to 2023-02-23
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Feedback
I did a verification run. The PR should be good to merge now.
- Status changed from Feedback to Resolved
We currently have other issues with the openQA-in-openQA test but I haven't seen this issue anymore.