action #99123

closed

ssh based backends can run into timeout if ssh connection is stuck

Added by okurz about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: Feature requests
Target version:
Start date: 2021-09-23
Due date: 2021-10-14
% Done: 0%
Estimated time:

Description

Observation

From https://suse.slack.com/archives/C02CANHLANP/p1632408138493900

There are also a lot of jobs that failed in bootloader on PowerPC: https://openqa.suse.de/tests/7200264#step/bootloader_start/3

This job ran into the default openQA 2h timeout. Excerpt from the log:

[2021-09-23T05:35:13.884 CEST] [debug] <<< backend::baseclass::run_ssh(cmd="! lssyscfg -m redcurrant -r lpar --filter 'lpar_ids=8' -F state | grep -i 'not activated' -q", password="SECRET", username="hscroot", wantarray=0, keep_open=0, hostname="powerhmc1.arch.suse.de")
[2021-09-23T05:35:13.885 CEST] [debug] <<< backend::baseclass::new_ssh_connection(wantarray=0, hostname="powerhmc1.arch.suse.de", keep_open=0, blocking=1, password="SECRET", username="hscroot")
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":39057"
      after 39145 requests (39145 known processed) with 0 events remaining.
[2021-09-23T07:28:47.185 CEST] [debug] backend got TERM

So something did not time out properly within 2h. Could it be the lssyscfg command?

Suggestions

I suggest improving the ssh command so that it does not get stuck for 2h but times out after a reasonable time. That would be a start.
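To illustrate the idea of bounding a possibly stuck command, here is a minimal sketch in Python. It is not the openQA backend code: the helper name run_with_timeout and the timeout values are assumptions made up for this example. It only shows the general pattern of giving an external command a hard deadline instead of waiting for the job-level timeout.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv with a hard deadline (hypothetical helper for illustration).

    Returns (exit_code, timed_out). exit_code is None if the deadline hit.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising,
        # so nothing is left hanging after the deadline.
        return None, True
    return proc.returncode, False


if __name__ == "__main__":
    # Stand-in for a stuck ssh call: a child process that sleeps for 30s.
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(stuck, timeout_s=1))
```

With a real ssh client the same wrapper could be combined with OpenSSH's own options, for example `-o ConnectTimeout=10 -o ServerAliveInterval=15 -o ServerAliveCountMax=3`, so that both a hanging connection attempt and a dead established session are detected. Whether and how the openQA backend can apply equivalent settings is not covered here.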
