action #50237
open
[sle] await_install - longer timeout when MAX_JOB_TIME defined NOT ONLY for aarch64
Added by whdu about 5 years ago.
Updated almost 3 years ago.
Description
The current code in installation/await_install.pm is:
# aarch64 can be particularily slow depending on the hardware
$timeout *= 2 if check_var('ARCH', 'aarch64') && get_var('MAX_JOB_TIME');
I recommend applying it in all cases, not only on aarch64, because the slowness occurs in many situations (e.g. preparing an image registered with proxy SCC in a local development environment).
What is your opinion? I need some input.
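To make the proposal concrete, here is a minimal sketch that compares the current condition with the proposed one. It is not the actual test module: `get_var` and `check_var` are simple stand-ins for the os-autoinst testapi helpers of the same name, and the variable values and the base timeout of 2000 seconds are hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for the testapi helpers, backed by a plain hash so that the
# sketch runs outside openQA. The values below are hypothetical.
my %vars = (ARCH => 'x86_64', MAX_JOB_TIME => 14400);

sub get_var {
    my ($name, $default) = @_;
    return $vars{$name} // $default;
}

sub check_var {
    my ($name, $value) = @_;
    return defined $vars{$name} && $vars{$name} eq $value;
}

# Current behaviour: the timeout is doubled only on aarch64.
sub current_timeout {
    my ($timeout) = @_;
    $timeout *= 2 if check_var('ARCH', 'aarch64') && get_var('MAX_JOB_TIME');
    return $timeout;
}

# Proposed behaviour: the timeout is doubled whenever MAX_JOB_TIME is set,
# regardless of the architecture.
sub proposed_timeout {
    my ($timeout) = @_;
    $timeout *= 2 if get_var('MAX_JOB_TIME');
    return $timeout;
}

printf "current: %d, proposed: %d\n", current_timeout(2000), proposed_timeout(2000);
```

With the x86_64 values above, the current condition leaves the timeout at 2000 while the proposed one raises it to 4000. On aarch64 both variants give the same result.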
That's a good idea for all platforms.
- Category set to Spike/Research
I would need to see links that support that hypothesis.
The timeout is there for a reason, which is discovering bugs whose effect is slowing down the SUT.
There are also SCC-related bugs that cause a timeout: https://bugzilla.suse.com/show_bug.cgi?id=1123963
There could be cases where increasing the timeout makes more sense, such as encrypted scenarios, but in general any timeout increase needs consensus among the different teams and release managers to confirm that the slowness is not caused by a bug.
And, if it is a bug, a workaround needs to be implemented, which requires a bug ticket and the use of record_soft_fail.
- Subject changed from [sle][security][sle15sp1] await_install - longer timeout when MAX_JOB_TIME defined NOT ONLY for aarch64 to [sle] await_install - longer timeout when MAX_JOB_TIME defined NOT ONLY for aarch64
- Assignee deleted (whdu)
SLindoMansilla wrote:
...
There could be cases where increasing the timeout makes more sense, such as encrypted scenarios, but in general any timeout increase needs consensus among the different teams and release managers to confirm that the slowness is not caused by a bug.
...
Yes, so I think we should get more inputs before making this change.
Have we seen a recent degradation in system performance? From my perspective it's fine to bump the timeout, but if it has become an issue we should investigate why, as it worked before for quite a while.
riafarov wrote:
Have we seen a recent degradation in system performance? From my perspective it's fine to bump the timeout, but if it has become an issue we should investigate why, as it worked before for quite a while.
As I described, it happened in a slow network environment, especially when preparing an image on my own openQA instance for development purposes, registered via proxy SCC (meaning the system gets its packages from openqa.suse.de/assets/).
- Priority changed from Normal to Low