action #120429
closed coordination #121876: [epic] Handle openQA review failures in Yam squad - SLE 15 SP5
Increase timeout when checking if YaST logs can be uploaded
Description
Motivation
Three jobs failed:
https://openqa.suse.de/tests/9951916#step/logs_from_installation_system/6
https://openqa.suse.de/tests/9951915
https://openqa.suse.de/tests/9951914
According to the screenshot at https://openqa.suse.de/tests/9951916#step/logs_from_installation_system/4, the ping command received a response, but the check still timed out.
We should increase the timeout when checking whether logs can be uploaded; otherwise the test fails on s390x KVM.
Code to check: [can_upload_logs](https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/network_utils.pm#L72)
Acceptance criteria
AC1: Increase timeout when checking if YaST logs can be uploaded
Updated by JERiveraMoya about 2 years ago
- Subject changed from test fails in logs_from_installation_system - command 'ping -c 1 worker2.oqa.suse.de' timed out to Increase timeout when checking if YaST logs can be uploaded
- Description updated
- Priority changed from Normal to High
- Target version set to Current
Updated by coolgw about 2 years ago
I wonder whether /dev/ttysclp0 is still working.
Updated by JERiveraMoya about 2 years ago
- Tags deleted (qe-yast-refinement)
- Status changed from New to Workable
Updated by geor about 2 years ago
- Status changed from Workable to In Progress
- Assignee set to geor
Updated by geor about 2 years ago
Updated by geor almost 2 years ago
PR closed.
After several iterations we can conclude that increasing the timeout does not resolve the issue.
The culprit for the occasional failure is script_run, which does not always capture the return code of the ping command. ping may return successfully within the first 10 seconds, yet script_run still times out after 60 seconds (or whatever timeout value it was given).
A new approach is needed that addresses the sporadic inability of script_run's underlying functionality to get ping's return code.
As discussed in the daily, we will not create a new ticket, to avoid clutter, and will instead work on the newly identified issue in this ticket.
Updated by openqa_review almost 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles12sp4_ltss_pscc_sdk_all_full
https://openqa.suse.de/tests/10219982#step/logs_from_installation_system/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by geor almost 2 years ago
So, it turns out that this is not related to the ping command; rather, there are cases where the return code of a shell command is not captured by openQA.
For instance, we have seen an assert_script_run("ls") command time out even though ls had returned successfully.
This issue, where the return code of a shell command is not captured, appears only sporadically and is not easy to debug given its nature.
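To illustrate the failure mode, here is a minimal Python sketch (not openQA code) of how a marker-based exit-code capture can report a timeout even though the command itself succeeded. It assumes that script_run works roughly by appending an echo of a marker plus $? to the command and then waiting for that marker on the console. The helper names build_command, parse_exit_code and run_on_console, the marker string MARK and the lossy flag are hypothetical and exist only for this illustration.

```python
import re
import subprocess


def build_command(cmd: str, marker: str) -> str:
    """Append an echo that prints the exit status wrapped in a marker."""
    return f"{cmd}; echo {marker}-$?-"


def parse_exit_code(console_output: str, marker: str):
    """Return the exit code that follows the marker, or None if it is missing.

    None stands in for a timeout: the caller never saw the marker.
    """
    match = re.search(rf"{re.escape(marker)}-(\d+)-", console_output)
    return int(match.group(1)) if match else None


def run_on_console(cmd: str, marker: str = "MARK", lossy: bool = False):
    """Run cmd in a local shell and treat its stdout as the console output."""
    result = subprocess.run(
        ["sh", "-c", build_command(cmd, marker)],
        capture_output=True, text=True, timeout=10,
    )
    output = result.stdout
    if lossy:
        # Assumption for illustration: the console loses the marker text.
        output = output.replace(marker, "")
    return parse_exit_code(output, marker)


if __name__ == "__main__":
    print(run_on_console("true"))              # 0: marker seen, exit code captured
    print(run_on_console("false"))             # 1: non-zero exit code captured
    print(run_on_console("true", lossy=True))  # None: command succeeded, marker lost
```

In the third call the command has already finished with exit code 0, but because the marker never reaches the reader, the caller cannot tell success from a hang. Under this assumption, a longer timeout would only delay the failure, which matches the observation that increasing the timeout did not help.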
Updated by geor almost 2 years ago
Opened subsequent ticket: https://progress.opensuse.org/issues/122608
Updated by openqa_review almost 2 years ago
- Status changed from Closed to Feedback
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles12sp5_pscc_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full
https://openqa.suse.de/tests/10256078#step/logs_from_installation_system/1
Updated by openqa_review almost 2 years ago
- Status changed from Resolved to Feedback
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles12sp5_media_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full
https://openqa.suse.de/tests/10448596#step/logs_from_installation_system/1