action #152569
Updated by livdywan about 1 year ago
## Observation
While investigating #152560, we noticed that there are also many *restarted* incomplete jobs, like this one:
https://openqa.suse.de/tests/13062217
```
Reason: backend died: Error connecting to VNC server <unreal6.qe.nue2.suse.org:5901>: IO::Socket::INET: connect: Connection refused
```
Apparently there is an `auto_clone_regex` feature that restarts a job directly in openQA if its reason matches a certain regex.
However, it doesn't make sense to restart the same job thousands of times. I couldn't even find the original job (I haven't tried the recursion feature yet).
In total, I found over 17k jobs with that error about `unreal6.qe.nue2.suse.org` since mid-November.
A symptom of such huge restart/clone chains is:
```
Dec 04 14:39:53 openqa openqa-gru[6326]: Deep recursion on subroutine "OpenQA::Schema::Result::Jobs::related_scheduled_product_id" at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 2016.
```
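The warning above suggests that the clone chain is followed with one nested call per job (Perl emits this warning once a subroutine is nested 100 calls deep). As a rough illustration of the alternative, here is a minimal Python sketch of walking such a chain with a loop instead. It is not openQA code: `find_chain_origin` and the `cloned_from` mapping are hypothetical names, and the mapping merely stands in for whatever links a restarted job to the job it was cloned from.

```python
from typing import Dict, Optional, Tuple


def find_chain_origin(
    job_id: int, cloned_from: Dict[int, Optional[int]]
) -> Tuple[int, int]:
    """Follow a clone chain back to its first job without recursion.

    `cloned_from` is a hypothetical mapping: job id -> id of the job it was
    restarted from, or None for the original job.
    Returns (origin_job_id, number_of_hops).
    """
    seen = set()  # guards against accidental cycles in the data
    current = job_id
    hops = 0
    while True:
        seen.add(current)
        parent = cloned_from.get(current)
        if parent is None or parent in seen:
            return current, hops
        current = parent
        hops += 1


if __name__ == "__main__":
    # Simulate a chain of 20000 restarts, similar in scale to the 17k jobs above.
    chain: Dict[int, Optional[int]] = {i: i - 1 for i in range(2, 20001)}
    chain[1] = None
    print(find_chain_origin(20000, chain))
```

Because the loop keeps only the current job id and a set of visited ids, the depth of the chain no longer affects the call stack. The hop count it returns could also serve as the restart counter for a limit.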
## Acceptance Criteria
* **AC1**: Incomplete jobs are automatically restarted at most n times (n is configurable)
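
To make AC1 more concrete, here is a minimal Python sketch of the intended decision logic, not the actual openQA implementation. The function `should_auto_restart`, the parameter `auto_clone_limit`, and the regex value in the usage example are assumptions made for illustration; only the name `auto_clone_regex` and the failure reason come from this ticket.

```python
import re


def should_auto_restart(
    reason: str,
    auto_clone_regex: str,
    restarts_so_far: int,
    auto_clone_limit: int,
) -> bool:
    """Decide whether an incomplete job may be restarted automatically.

    `restarts_so_far` is the number of automatic restarts already in the
    job's clone chain; `auto_clone_limit` is the hypothetical configurable cap.
    """
    if restarts_so_far >= auto_clone_limit:
        return False  # cap reached: leave the job incomplete
    return re.search(auto_clone_regex, reason) is not None


if __name__ == "__main__":
    reason = (
        "backend died: Error connecting to VNC server "
        "<unreal6.qe.nue2.suse.org:5901>: IO::Socket::INET: "
        "connect: Connection refused"
    )
    regex = r"Error connecting to VNC server"  # example value, not the real default
    print(should_auto_restart(reason, regex, restarts_so_far=0, auto_clone_limit=3))
    print(should_auto_restart(reason, regex, restarts_so_far=3, auto_clone_limit=3))
```

Checking the cap before the regex means a job that keeps failing for the same reason stops being cloned once the limit is reached, regardless of how well the reason matches.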
## Suggestions
* Implement a cap/limit on the automatic restarting of incomplete jobs
* Search for `auto_clone_regex` in the code repository to find the relevant starting point
* Look into avoiding the deep recursion as well