action #81835
closed
`/dev/sshserial` is broken on generalhw backend
0%
Description
Observation
openQA test in scenario opensuse-Tumbleweed-JeOS-for-RPi-aarch64-jeos-containers@RPi3 fails in prepare_firstboot
`/dev/sshserial` is broken on generalhw backend, so all Raspberry Pi tests are red.
Test suite description
JeOS as container host. Test container runtimes (podman and docker) and related tools
Reproducible
Fails since (at least) Build 20210105 (current job)
Expected result
Last good: 20201228 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by okurz over 3 years ago
- Target version set to future
Unfortunately, I do not have a good idea of what could have broken this. I assume you have the best chance of fixing this, given your access to Raspberry Pi runners.
Updated by ggardet_arm over 3 years ago
Latest working log:
GOT GO
[2020-12-29T18:25:54.776 CET] [debug] Snapshots are not supported
[2020-12-29T18:25:54.784 CET] [debug] ||| starting prepare_firstboot tests/jeos/prepare_firstboot.pm
[2020-12-29T18:25:54.787 CET] [debug] tests/jeos/prepare_firstboot.pm:36 called opensusebasetest::select_serial_terminal -> lib/opensusebasetest.pm:1240 called testapi::select_console
[2020-12-29T18:25:54.788 CET] [debug] <<< testapi::select_console(testapi_console="root-ssh")
/usr/lib/os-autoinst/consoles/vnc_base.pm:62:{
"port" => 45377,
"hostname" => "localhost",
"ikvm" => 0
}
[2020-12-29T18:25:55.341 CET] [debug] Connected to Xvnc - PID 25526
icewm PID is 25530
[2020-12-29T18:25:56.359 CET] [debug] Wait for SSH on host 192.168.0.54 (timeout: 240)
xterm PID is 25714
[2020-12-29T18:27:18.455 CET] [debug] <<< backend::baseclass::start_ssh_serial(username="root", password="SECRET", hostname="192.168.0.54")
[2020-12-29T18:27:18.455 CET] [debug] <<< backend::baseclass::new_ssh_connection(hostname="192.168.0.54", username="root", password="SECRET")
[2020-12-29T18:27:18.603 CET] [debug] SSH connection to root@192.168.0.54 established
[2020-12-29T18:27:19.045 CET] [debug] ssh xterm vt: grabbing serial console
[2020-12-29T18:27:19.098 CET] [debug] led state 0 0 0 -261
[2020-12-29T18:27:19.113 CET] [debug] activate_console, console: root-ssh, type: ssh
vs. the current log:
GOT GO
[2021-01-07T09:33:08.399 CET] [debug] Snapshots are not supported
[2021-01-07T09:33:08.408 CET] [debug] ||| starting prepare_firstboot tests/jeos/prepare_firstboot.pm
[2021-01-07T09:33:08.411 CET] [debug] tests/jeos/prepare_firstboot.pm:36 called opensusebasetest::select_serial_terminal -> lib/opensusebasetest.pm:1242 called testapi::select_console
[2021-01-07T09:33:08.412 CET] [debug] <<< testapi::select_console(testapi_console="root-serial-ssh")
[2021-01-07T09:33:08.414 CET] [debug] Connecting SSH serial console for root@192.168.0.54
[2021-01-07T09:33:08.415 CET] [debug] <<< backend::baseclass::new_ssh_connection(password="SECRET", hostname="192.168.0.54", username="root")
[2021-01-07T09:33:11.560 CET] [debug] Could not connect to root@192.168.0.54, Retrying after some seconds...
[2021-01-07T09:33:24.690 CET] [debug] Could not connect to root@192.168.0.54, Retrying after some seconds...
[2021-01-07T09:33:37.800 CET] [debug] Could not connect to root@192.168.0.54, Retrying after some seconds...
[2021-01-07T09:33:50.930 CET] [debug] Could not connect to root@192.168.0.54, Retrying after some seconds...
[2021-01-07T09:34:04.040 CET] [debug] Could not connect to root@192.168.0.54, Retrying after some seconds...
[2021-01-07T09:34:14.050 CET] [info] ::: basetest::runtest: # Test died: Error connecting to <root@192.168.0.54>: No route to host at /usr/lib/os-autoinst/testapi.pm line 1701.
[2021-01-07T09:34:14.051 CET] [debug] lib/opensusebasetest.pm:1329 called opensusebasetest::select_log_console -> lib/opensusebasetest.pm:450 called testapi::select_console
[2021-01-07T09:34:14.051 CET] [debug] <<< testapi::select_console(testapi_console="log-console", timeout=180)
/usr/lib/os-autoinst/consoles/vnc_base.pm:62:{
"hostname" => "localhost",
"port" => 54677,
"ikvm" => 0
}
[2021-01-07T09:34:14.309 CET] [debug] Connected to Xvnc - PID 17529
icewm PID is 17533
[2021-01-07T09:34:15.327 CET] [debug] Wait for SSH on host 192.168.0.54 (timeout: 240)
xterm PID is 17539
[2021-01-07T09:34:34.417 CET] [debug] led state 0 0 0 -261
[2021-01-07T09:34:34.432 CET] [debug] activate_console, console: log-console, type: ssh
Updated by ggardet_arm over 3 years ago
So, it seems the problem comes from: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/11625
Updated by MDoucha over 3 years ago
ggardet_arm wrote:
So, it seems the problem comes from: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/11625
The log says that SSH can't connect to the SUT at all.
You can try swapping `$self->select_serial_terminal;` for `select_console('root-ssh');` to restore the original behavior here:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/jeos/prepare_firstboot.pm#L36
But I seriously doubt that it'll work. You'll just get the same error because the SUT appears to be unreachable via network in the first place.
Updated by ggardet_arm over 3 years ago
MDoucha wrote:
ggardet_arm wrote:
So, it seems the problem comes from: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/11625
The log says that SSH can't connect to the SUT at all.
This is expected at this point, as we have just switched on the SUT. The previous behavior was to wait for the SUT to appear on the network, with a timeout.
You can try swapping `$self->select_serial_terminal;` for `select_console('root-ssh');` to restore the original behavior here:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/jeos/prepare_firstboot.pm#L36
But I seriously doubt that it'll work. You'll just get the same error because the SUT appears to be unreachable via network in the first place.
It seems you recreated something that already existed, with slightly different behavior. Your patch expects the SUT to be already reachable, which is not yet the case here.
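To make the difference concrete, here is a minimal standalone Perl sketch of "wait for the SUT to appear on the network, with a timeout". It is not the os-autoinst implementation: `wait_for_tcp_port` and its parameters are hypothetical names, and the polling interval and per-attempt connect timeout are assumptions. Only the default of 240 seconds mirrors the `Wait for SSH on host 192.168.0.54 (timeout: 240)` line in the working log.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(sleep time);

# Hypothetical helper (not part of os-autoinst): poll a TCP port until it
# accepts a connection or the overall timeout expires.
# Returns 1 if the port became reachable, 0 otherwise.
sub wait_for_tcp_port {
    my (%args) = @_;
    my $host     = $args{host};
    my $port     = $args{port}     // 22;     # SSH by default
    my $timeout  = $args{timeout}  // 240;    # seconds, as in the working log
    my $interval = $args{interval} // 5;      # seconds between attempts (assumed)

    my $deadline = time() + $timeout;
    while (1) {
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => 'tcp',
            Timeout  => 3,                     # per-attempt connect timeout (assumed)
        );
        if ($sock) {
            close $sock;
            return 1;
        }
        return 0 if time() >= $deadline;
        sleep $interval;
    }
}

# Example use against a freshly powered-on SUT:
# die "SUT did not come up\n"
#   unless wait_for_tcp_port(host => '192.168.0.54', timeout => 240);
```

With a loop of this shape, a SUT that has only just been switched on is given time to boot, rather than failing after a handful of connection retries as in the current log.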
Updated by MDoucha over 3 years ago
ggardet_arm wrote:
This is expected at this point, as we have just switched on the SUT. The previous behavior was to wait for the SUT to appear on the network, with a timeout.
It seems you recreated something that already existed, with slightly different behavior. Your patch expects the SUT to be already reachable, which is not yet the case here.
It's not safe to call `select_serial_terminal` before the SUT is fully booted. There are multiple console backends in there which require access to an interactive shell on the previous console in order to be activated.
Updated by ggardet_arm over 3 years ago
MDoucha wrote:
ggardet_arm wrote:
This is expected at this point, as we have just switched on the SUT. The previous behavior was to wait for the SUT to appear on the network, with a timeout.
It seems you recreated something that already existed, with slightly different behavior. Your patch expects the SUT to be already reachable, which is not yet the case here.
It's not safe to call `select_serial_terminal` before the SUT is fully booted. There are multiple console backends in there which require access to an interactive shell on the previous console in order to be activated.
It worked perfectly fine before your patch.
All the code needed to wait for the SUT has been there for more than a year [0] and is used every day.
Updated by MDoucha over 3 years ago
ggardet_arm wrote:
It worked perfectly fine before your patch.
All the code needed to wait for the SUT has been there for more than a year [0] and is used every day.
Because your code was relying on special behavior of the `root-ssh` console. Select it explicitly with `select_console('root-ssh');` and wait for the login prompt. Then you can safely call `select_serial_terminal` if you want a better console than VNC.
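The ordering suggested here can be sketched as follows. To keep the snippet runnable without os-autoinst, `select_console` and `select_serial_terminal` are replaced by local stand-ins whose bodies are assumptions: they only record calls, and the stand-in for `select_serial_terminal` refuses to run before any console has been activated. In a real test module the first comes from `use testapi;` and the second is called as `$self->select_serial_terminal`.

```perl
use strict;
use warnings;

# Stand-ins for the os-autoinst test API (assumptions, for illustration only).
my @activated;

sub select_console {
    my ($name) = @_;
    # The real root-ssh console waits for the SUT to be reachable over SSH.
    push @activated, $name;
    return $name;
}

sub select_serial_terminal {
    # Models the constraint from the discussion: a previously activated
    # console with an interactive shell is required first.
    die "no console activated yet\n" unless @activated;
    push @activated, 'serial-terminal';
    return 1;
}

sub run {
    # 1. Explicitly select root-ssh; this is the step that waits for the SUT.
    select_console('root-ssh');
    # 2. Only then switch to the serial terminal.
    select_serial_terminal();
    return @activated;
}
```

Calling `run()` records `root-ssh` first and the serial terminal second; calling the serial-terminal stand-in on its own dies, which is the failure mode this ordering is meant to avoid.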
Updated by ggardet_arm over 3 years ago
- Status changed from New to Resolved
- Assignee set to ggardet_arm