action #175464

closed

coordination #102906: [saga][epic] Increased stability of tests with less "known failures", known incompletes handled automatically within openQA

coordination #175515: [epic] incomplete jobs with "Failed to find an available port: Address already in use"

jobs incomplete with auto_review:"setup failure: isotovideo can not be started"

Added by rainerkoenig 16 days ago. Updated 15 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2025-01-15
Due date:
% Done: 0%
Estimated time:

Description

Observation

Jobs are failing with "setup failure: isotovideo can not be started" on OSD, e.g. https://openqa.suse.de/tests/16454025, after rsync is done.

The same symptoms appear on o3, e.g. https://openqa.opensuse.org/tests/4774219, after assets were synced successfully.

https://openqa.opensuse.org/tests/4774741 and many other jobs on o3 are incomplete with "setup failure: isotovideo can not be started". Neither autoinst-log.txt nor worker-log.txt contains further details that seem related.

Possible regression since 2025-01-13. From /var/log/zypp/history on w26:

2025-01-13 09:49:42|install|os-autoinst|4.6.1736759998.a4f72cc-lp156.2000.1|x86_64||devel_openQA|5eea1b72f67fa131e63ac48ecaff2191352a5b3acfe6faf02c82013148ef164f|
2025-01-14 16:17:37|install|os-autoinst|4.6.1736869520.3d40ba7-lp156.2001.1|x86_64||devel_openQA|aa6827c564e410e9893e2bff91270229378cf6d1f882520923782ded1ede257b|

The only commit included is "855fbb60 (okurz/fix/diag_explain) t: Fix hidden output of 'diag explain'", which changes files only in "t/".

openQA-worker maybe? Cross-checking the last good job, https://openqa.opensuse.org/tests/4769715, with /var/log/zypp/history, the last good version seems to be "2025-01-13 15:59:48|install|openQA-worker|4.6.1736782755.0771fde7-lp156.7510.1|x86_64||devel_openQA|e0194139cae23cfdcb5b869bb92962b3106caf041cba9585752756416647cc30|". Diff:

6f71c5c20 (okurz/feature/openqa_gru, feature/openqa_gru) systemd: Fix premature kill of openqa-gru background processes
5742c2302 Bump eslint-config-prettier from 9.1.0 to 10.0.1
03a921907 Avoid calling `is_running` unnecessarily in `kill`

Maybe ReadWriteProcess?
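The cross-check against /var/log/zypp/history described above can be sketched in a few lines of Python. This is only an illustration, not a tool used in this ticket: the function name `parse_installs` and the field names in `FIELDS` are hypothetical, and the field order is assumed from the pipe-separated "install" lines quoted above.

```python
from datetime import datetime

# Assumed field order for "install" records, inferred from the lines quoted
# in this ticket. The names are illustrative, not an official schema.
FIELDS = ("timestamp", "action", "name", "version",
          "arch", "requested_by", "repo", "checksum")


def parse_installs(lines, packages, since):
    """Return install records for `packages` logged at or after `since`."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("|")
        # Keep only well-formed "install" records.
        if len(parts) < len(FIELDS) or parts[1] != "install":
            continue
        record = dict(zip(FIELDS, parts))
        record["timestamp"] = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S")
        if record["name"] in packages and record["timestamp"] >= since:
            records.append(record)
    return records


# Sample input: the three history lines quoted in this ticket.
sample = [
    "2025-01-13 09:49:42|install|os-autoinst|4.6.1736759998.a4f72cc-lp156.2000.1|x86_64||devel_openQA|5eea1b72f67fa131e63ac48ecaff2191352a5b3acfe6faf02c82013148ef164f|",
    "2025-01-13 15:59:48|install|openQA-worker|4.6.1736782755.0771fde7-lp156.7510.1|x86_64||devel_openQA|e0194139cae23cfdcb5b869bb92962b3106caf041cba9585752756416647cc30|",
    "2025-01-14 16:17:37|install|os-autoinst|4.6.1736869520.3d40ba7-lp156.2001.1|x86_64||devel_openQA|aa6827c564e410e9893e2bff91270229378cf6d1f882520923782ded1ede257b|",
]

for rec in parse_installs(sample, {"os-autoinst", "openQA-worker"},
                          datetime(2025, 1, 13)):
    print(rec["timestamp"], rec["name"], rec["version"])
```

On a worker, the `sample` list could be replaced by the lines of the history file itself, which typically requires root permissions to read.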

Suggestions

  • Identify recent changes relevant to the failure and mitigate as a first step
  • These are seen at least on grenache and worker24

Rollback steps

  • DONE Re-enable openQA-auto-update

Related issues 6 (0 open, 6 closed)

Related to openQA Project (public) - action #170209: [sporadic] auto_review:"Failed to find an available port: Address already in use":retry, produces incomplete jobs on OSD, multiple machines size:M | Resolved | mkittler | 2024-11-25

Related to openQA Infrastructure (public) - action #175473: OpenQA Jobs test - Incomplete jobs (not restarted) of last 24h alert Salt | Resolved | okurz | 2024-12-19

Has duplicate openQA Project (public) - action #175470: jobs incomplete with auto_review:"setup failure: isotovideo can not be started" | Rejected | okurz | 2025-01-15

Has duplicate openQA Infrastructure (public) - action #175494: [openQA][worker][ipmi] isotovideo can not be started | Rejected | | 2025-01-15

Copied to openQA Project (public) - action #175482: jobs incomplete with auto_review:"setup failure: isotovideo can not be started" - why did no tests prevent this to be deployed on both o3+osd? | Rejected | okurz | 2025-01-15

Copied to openQA Project (public) - action #175518: Conduct "lessons learned" with Five Why analysis for "jobs incomplete with setup failure: isotovideo can not be started" size:S | Resolved | livdywan | 2025-01-24