action #178324: [sporadic] svirt s390x tests sometimes time out while syncing assets auto_review:"LIBSSH2_ERROR_TIMEOUT[\s\S]*rsync":retry size:S - openQA Project (public) - openSUSE Project Management Tool

coordination #176337: [saga][epic] Stable os-autoinst backends with stable command execution (no mistyping)

coordination #125708: [epic] Future ideas for more stable non-qemu backends

Added by mkittler 2 months ago. Updated about 2 months ago.

Status: Resolved
Priority: High
Assignee: mkittler
Category: Regressions/Crashes
Target version: Ready
Start date: 2025-02-10
Due date:
% Done:
Estimated time:
Tags: osd, s390x, reactive work

Description

Observation

Since #176868 was resolved, the timeout for syncing assets is enforced correctly when it expires:

[2025-03-04T01:57:33.064193Z] [debug] [pid:34725] Using existing SSH connection (key:hostname=s390zl12.oqa.prg2.suse.org,username=root,port=22)
[2025-03-04T02:12:35.031481Z] [debug] [pid:34725] [run_ssh_cmd(rsync --timeout='150' --stats -av '/var/lib/openqa/share/factory/hdd/sle-15-SP3-s390x-5.3.18-150300.268.1.gd2bdf5f-Server-DVD-Incidents-Kernel-KOTD@s390x-kvm-with-ltp.qcow2' '/var/lib/libvirt/images//sle-15-SP3-s390x-5.3.18-150300.268.1.gd2bdf5f-Server-DVD-Incidents-Kernel-KOTD@s390x-kvm-with-ltp.qcow2')] stdout:
  sending incremental file list
  sle-15-SP3-s390x-5.3.18-150300.268.1.gd2bdf5f-Server-DVD-Incidents-Kernel-KOTD@s390x-kvm-with-ltp.qcow2
…
  Time out waiting for data (-9 LIBSSH2_ERROR_TIMEOUT) at /usr/lib/perl5/vendor_perl/5.26.1/x86_64-linux-thread-multi/Net/SSH2.pm line 51.
    Net::SSH2::die_with_error(Net::SSH2=SCALAR(0x562b7a3cd8c8)) called at /usr/lib/os-autoinst/backend/baseclass.pm line 1328
    backend::baseclass::run_ssh_cmd(backend::svirt=HASH(0x562b7a84d618), "rsync --timeout='150' --stats -av '/var/lib/openqa/share/fact"..., "username", "root", "hostname", "s390zl12.oqa.prg2.suse.org", "password", "Nots3cr3t-\@3-vt", ...) called at /usr/lib/os-autoinst/consoles/sshVirtsh.pm line 674
    consoles::sshVirtsh::run_cmd(consoles::sshVirtsh=HASH(0x562b79c875c8), "rsync --timeout='150' --stats -av '/var/lib/openqa/share/fact"..., "timeout", 900) called at /usr/lib/os-autoinst/consoles/sshVirtsh.pm line 396
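As a quick sanity check, the gap between the two log timestamps above can be compared with the timeout of 900 seconds passed to run_cmd in the trace. This is only a minimal Python sketch; the variable names are illustrative and not taken from os-autoinst.

```python
from datetime import datetime

# Timestamps copied from the two log lines above.
FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
started = datetime.strptime("2025-03-04T01:57:33.064193Z", FORMAT)
failed = datetime.strptime("2025-03-04T02:12:35.031481Z", FORMAT)

# Timeout in seconds as passed to consoles::sshVirtsh::run_cmd in the trace.
timeout_s = 900

elapsed_s = (failed - started).total_seconds()
print(f"elapsed: {elapsed_s:.1f} s, timeout: {timeout_s} s")
```

The gap is roughly 902 seconds, i.e. just past the 15-minute limit.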

The underlying problem, that this rsync command can take very long (or even get stuck), hasn't been resolved, though. The problem affected multiple jobs (see #176868#note-17), but more recent jobs look good again.

We need to figure out whether the download is really that slow or whether the SSH connection is going stale for some reason. In the latter case, retrying the download with a fresh connection might help.
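The idea of retrying with a fresh connection could look roughly like the following. This is a language-agnostic sketch written in Python, not the os-autoinst implementation (which is Perl); `connect`, `run` and `run_with_fresh_connection` are hypothetical placeholders, and treating only `TimeoutError` as retryable is an assumption.

```python
from typing import Callable, TypeVar

C = TypeVar("C")
T = TypeVar("T")


def run_with_fresh_connection(
    connect: Callable[[], C],
    run: Callable[[C], T],
    attempts: int = 2,
) -> T:
    """Run `run` on a newly opened connection per attempt, retrying on timeouts.

    `connect` and `run` are hypothetical placeholders, not os-autoinst APIs.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        conn = connect()  # never reuse the possibly stale connection
        try:
            return run(conn)
        except TimeoutError as error:
            last_error = error  # remember the error and try a new connection
    raise last_error
```

If the second attempt on a new connection succeeds quickly, that would point towards a stale connection rather than a slow download.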

Steps to reproduce

Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label
by calling: openqa-query-for-job-label poo#178324

Suggestions

Look into https://openqa.suse.de/tests/16929919#next_previous and consider cloning the latest job e.g. 100 times
Maybe it helps to increase the timeout from 15 minutes to e.g. 25 minutes by setting SVIRT_ASSET_DOWNLOAD_TIMEOUT_M or changing the default in os-autoinst. Note that before this timeout was enforced, jobs typically ran into the overall job timeout (see #176076), so this is actually unlikely to help.
Check out https://openqa.suse.de/tests/overview?modules=bootloader_zkvm&modules_result=failed for problematic jobs (not all of them are about the same issue!)
Adjust the auto_review regex as needed
Try adding debug flags to rsync and storing its output in a file other than os-autoinst-log.txt
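Before adjusting the auto_review regex, it may help to check the current pattern from the ticket title against a log excerpt. The sketch below uses Python's re module purely for illustration; the actual auto-review tooling may apply the pattern differently, and the "unrelated" and "wrong order" strings are made-up examples.

```python
import re

# Pattern taken from the auto_review field in the ticket title.
AUTO_REVIEW = re.compile(r"LIBSSH2_ERROR_TIMEOUT[\s\S]*rsync")

# Shortened excerpt of the log from the observation.
log_excerpt = r"""
  Time out waiting for data (-9 LIBSSH2_ERROR_TIMEOUT) at /usr/lib/perl5/vendor_perl/5.26.1/x86_64-linux-thread-multi/Net/SSH2.pm line 51.
    Net::SSH2::die_with_error(Net::SSH2=SCALAR(0x562b7a3cd8c8)) called at /usr/lib/os-autoinst/backend/baseclass.pm line 1328
    backend::baseclass::run_ssh_cmd(backend::svirt=HASH(0x562b7a84d618), "rsync --timeout='150' --stats -av '/var/lib/openqa/share/fact"..., ...
"""

# Made-up counterexamples: no SSH timeout at all, and rsync only before the error.
unrelated = "rsync finished without any SSH error"
wrong_order = "rsync started\nLIBSSH2_ERROR_TIMEOUT"

for name, text in [("log", log_excerpt), ("unrelated", unrelated), ("wrong order", wrong_order)]:
    print(name, bool(AUTO_REVIEW.search(text)))
```

Because `[\s\S]*` also matches newlines, the pattern only requires that "rsync" appears somewhere after the error code, possibly several lines later, as in the stack trace.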

Related issues 3 (1 open — 2 closed)
