action #164613
Job with RETRY should not be cloned if it was obsoleted
Description
Check this little job cluster:
- https://openqa.fedoraproject.org/tests/2755002
- https://openqa.fedoraproject.org/tests/2755034
- https://openqa.fedoraproject.org/tests/2755068
What happened there is that 2755002 ran, but while it was running, a new POST was sent that created 2755034. This is because the Fedora packager edited the update and fixed the bug that was causing failures. However, when 2755034 was created and 2755002 was obsoleted, 2755002 also got cloned, because it had RETRY set and had not yet been retried. This resulted in the creation of 2755068, which is considered the most 'current' test run.
Because it was a clone of 2755002, it had the same settings as 2755002, without the new packages that were added when the maintainer edited the update (compare the value of ADVISORY_NVRS_1 in each of the three tests; in 2755002 and 2755068 it has just one package, in 2755034 it has three). This meant it failed. 2755034 passed.
So until I retriggered the tests again just now, the web UI and anything else that looks for the most current result considered the test failed, because they took 2755068 (with the settings reflecting the old state of the update) instead of 2755034 (with the settings reflecting the new state of the update).
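To illustrate why the stale clone wins, here is a minimal sketch. It assumes that "most current" simply means the highest job ID among the jobs for the same scenario; that is an assumption for illustration, not openQA's actual query. The `jobs` list and the `latest_job` helper are hypothetical; the job IDs, results and package counts are the ones described above.

```python
# Hypothetical model of the three jobs from this ticket.
# "nvr_count" stands for the number of packages listed in ADVISORY_NVRS_1.
jobs = [
    {"id": 2755002, "result": "obsoleted", "nvr_count": 1},
    {"id": 2755034, "result": "passed", "nvr_count": 3},
    {"id": 2755068, "result": "failed", "nvr_count": 1},  # clone of 2755002
]


def latest_job(jobs):
    """Return the job with the highest ID (assumed meaning of 'most current')."""
    return max(jobs, key=lambda job: job["id"])


current = latest_job(jobs)
print(current["id"], current["result"])
```

Under that assumption the clone 2755068 is selected, even though 2755034 is the only job that ran with the updated package list.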
I would argue that if a new POST happens, that probably indicates that we want the test created by that POST to be the 'current' test, not a clone of a previously-posted test with RETRY set which was running when the POST happened. So I think we should not clone a job with RETRY set if it ended because it was obsoleted by a new POST.
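The proposed rule could be sketched roughly as follows. This is only an illustration of the idea, not openQA's code: the `Job` class, the `retries_left` field, the `RETRYABLE_RESULTS` set and `should_auto_clone` are all hypothetical names, and which results normally trigger a RETRY clone is assumed here.

```python
from dataclasses import dataclass

# Assumption: results that would normally trigger an automatic retry.
RETRYABLE_RESULTS = {"failed", "incomplete"}


@dataclass
class Job:
    id: int
    result: str
    retries_left: int  # hypothetical: attempts still allowed by RETRY


def should_auto_clone(job: Job) -> bool:
    """Decide whether a finished job should be cloned because of RETRY."""
    # Proposed rule: a job obsoleted by a new POST is never cloned,
    # so the job created by that POST stays the most current one.
    if job.result == "obsoleted":
        return False
    return job.result in RETRYABLE_RESULTS and job.retries_left > 0


print(should_auto_clone(Job(2755002, "obsoleted", 1)))
```

With this check, 2755002 would not have been cloned when it was obsoleted, and 2755034 would have remained the newest job for the update.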
Updated by okurz 5 months ago
https://github.com/os-autoinst/openQA/pull/5797 merged. Now trying to consistently skip the retry for all aborted results: https://github.com/os-autoinst/openQA/pull/5806
Updated by okurz 5 months ago
- Due date deleted (2024-08-14)
- Status changed from Feedback to Resolved
https://github.com/os-autoinst/openQA/pull/5806 merged and also deployed on OSD meanwhile: https://mailman.suse.de/mlarch/SuSE/openqa/2024/openqa.2024.08/msg00000.html
No obvious issues seen right now. I guess it would take a long time to see whether there are related problems anyway, so resolving.