action #174601

open

openqa-gru.service journal filled with openqa-trigger-bisect-jobs stack traces

Added by nicksinger 5 days ago. Updated 5 days ago.

Status: New
Priority: Normal
Assignee: -
Category: Regressions/Crashes
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Observation

While looking into https://progress.opensuse.org/issues/174580 and why openqa-gru failed today, I found many repeated stack traces in gru's journal, all looking similar to:

Dec 19 13:26:23 openqa openqa-gru[5066]: openqa-clone-job (81 /opt/os-autoinst-scripts/openqa-investigate): (openqa-clone-job --json-output --skip-chained-deps --max-depth 0 --parental-inheritance --within-instance https://openqa.suse.de/tests/16252243 TEST+=:investigate:last_good_tests:7c3e460816d9f4305b288674abdc15d295158b49 _TRIGGER_JOB_DONE_HOOK=1 _GROUP_ID=0 BUILD= CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git#7c3e460816d9f4305b288674abdc15d295158b49 OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/t16252243) stderr: >>>Current job 16252243 will fail, because the repositories for the below updates are unavailable<<<
Dec 19 13:26:23 openqa openqa-gru[5066]: openqa-clone-job (81 /opt/os-autoinst-scripts/openqa-investigate): (openqa-clone-job --json-output --skip-chained-deps --max-depth 0 --parental-inheritance --within-instance https://openqa.suse.de/tests/16252243 TEST+=:investigate:last_good_tests:7c3e460816d9f4305b288674abdc15d295158b49 _TRIGGER_JOB_DONE_HOOK=1 _GROUP_ID=0 BUILD= CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git#7c3e460816d9f4305b288674abdc15d295158b49 OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/t16252243) rc: 255 >>><<<
Dec 19 13:26:24 openqa openqa-gru[5129]: Current job 16252241 will fail, because the repositories for the below updates are unavailable
Dec 19 13:26:24 openqa openqa-gru[5129]: [
Dec 19 13:26:24 openqa openqa-gru[5129]:   "http://download.suse.de/ibs/SUSE:/Maintenance:/36747/SUSE_Updates_SLE-Module-Basesystem_15-SP5_x86_64/",
Dec 19 13:26:24 openqa openqa-gru[5129]: ] at /usr/share/openqa/script/../lib/OpenQA/Script/CloneJobSUSE.pm line 39.
Dec 19 13:26:24 openqa openqa-gru[5072]: Traceback (most recent call last):
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/opt/os-autoinst-scripts/openqa-trigger-bisect-jobs", line 322, in <module>
Dec 19 13:26:24 openqa openqa-gru[5072]:     main(parse_args())
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/opt/os-autoinst-scripts/openqa-trigger-bisect-jobs", line 304, in main
Dec 19 13:26:24 openqa openqa-gru[5072]:     args.dry_run,
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/opt/os-autoinst-scripts/openqa-trigger-bisect-jobs", line 149, in openqa_clone
Dec 19 13:26:24 openqa openqa-gru[5072]:     return call(["openqa-clone-job"] + default_opts + cmds + default_cmds, dry_run)
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/opt/os-autoinst-scripts/openqa-trigger-bisect-jobs", line 114, in call
Dec 19 13:26:24 openqa openqa-gru[5072]:     (["echo", "Simulating: "] if dry_run else []) + cmds
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
Dec 19 13:26:24 openqa openqa-gru[5072]:     **kwargs).stdout
Dec 19 13:26:24 openqa openqa-gru[5072]:   File "/usr/lib64/python3.6/subprocess.py", line 438, in run
Dec 19 13:26:24 openqa openqa-gru[5072]:     output=stdout, stderr=stderr)
Dec 19 13:26:24 openqa openqa-gru[5072]: subprocess.CalledProcessError: Command '['openqa-clone-job', '--skip-chained-deps', '--json-output', '--within-instance', 'https://openqa.suse.de/tests/16252241', 'SDK_TEST_REPOS=http://download.suse.de/ibs/SUSE:/Maintenance:/36728/SUSE_Updates_SLE-Module-Development-Tools_15-SP5_x86_64/,http://download.suse.de/ibs/SUSE:/Maintenance:/36797/SUSE_Updates_SLE-Module-Development-Tools_15-SP5_x86_64/,http://download.suse.de/ibs/SUSE:/Maintenance:/36821/SUSE_Updates_SLE-Module-Development-Tools_15-SP5_x86_64/', 'TEST=jeos-containers-podman:investigate:bisect_without_36475', 'OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/tests/16252241', 'MAINT_TEST_REPO=', '_GROUP=0']' returned non-zero exit status 255.
Dec 19 13:26:24 openqa openqa-gru[5158]: openqa-clone-job (81 /opt/os-autoinst-scripts/openqa-investigate): (openqa-clone-job --json-output --skip-chained-deps --max-depth 0 --parental-inheritance --within-instance https://openqa.suse.de/tests/16205104 TEST+=:investigate:last_good_build:20241216-1 _TRIGGER_JOB_DONE_HOOK=1 _GROUP_ID=0 BUILD= OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/t16252243) stderr: >>>Current job 16205104 will fail, because the repositories for the below updates are unavailable<<<
Dec 19 13:26:24 openqa openqa-gru[5158]: openqa-clone-job (81 /opt/os-autoinst-scripts/openqa-investigate): (openqa-clone-job --json-output --skip-chained-deps --max-depth 0 --parental-inheritance --within-instance https://openqa.suse.de/tests/16205104 TEST+=:investigate:last_good_build:20241216-1 _TRIGGER_JOB_DONE_HOOK=1 _GROUP_ID=0 BUILD= OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/t16252243) rc: 255 >>><<<
Dec 19 13:26:26 openqa openqa-gru[5248]: openqa-clone-job (81 /opt/os-autoinst-scripts/openqa-investigate): (openqa-clone-job --json-output --skip-chained-deps --max-depth 0 --parental-inheritance --within-instance https://openqa.suse.de/tests/16205104 TEST+=:investigate:last_good_tests_and_build:7c3e460816d9f4305b288674abdc15d295158b49+20241216-1 _TRIGGER_JOB_DONE_HOOK=1 _GROUP_ID=0 BUILD= CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git#7c3e460816d9f4305b288674abdc15d295158b49 WORKER_CLASS=svirt-xen,openqaw5-xen,zone-cc,region-prg,datacenter-dc7,location-prg2,worker35,cpu-x86_64,cpu-x86_64-v2,cpu-x86_64-v3 OPENQA_INVESTIGATE_ORIGIN=https://openqa.suse.de/t16252243) stderr: >>>Current job 16205104 will fail, because the repositories for the below updates are unavailable<<<

The first occurrence I can find in the logs is at Dec 19 09:55:21 (I assume this is UTC, as it comes directly from journalctl on OSD). I'm not sure whether this is causing the gru service to fail, but it is certainly something to look into.
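To illustrate why the journal fills up: the traceback shows `subprocess.check_output` raising `CalledProcessError` out of `call` all the way to the top level, so every failed clone produces a multi-line Python traceback. The following is only a minimal sketch of how such a wrapper could log one line per failure instead. The function name `call_quietly` and the message format are hypothetical, and this is not the actual code of `openqa-trigger-bisect-jobs`.

```python
import subprocess
import sys


def call_quietly(cmds, dry_run=False):
    """Run a command and return its stdout as text.

    Hypothetical helper: on a non-zero exit status it writes a single
    line to stderr and returns None instead of letting the exception
    propagate as a full traceback.
    """
    full_cmd = (["echo", "Simulating: "] if dry_run else []) + cmds
    try:
        output = subprocess.check_output(full_cmd, stderr=subprocess.PIPE)
        return output.decode("utf-8")
    except subprocess.CalledProcessError as e:
        # e.stderr holds the captured stderr because stderr=PIPE was passed
        stderr = e.stderr.decode("utf-8", errors="replace").strip()
        first_line = stderr.splitlines()[0] if stderr else ""
        print(
            "%s failed with rc %d: %s" % (cmds[0], e.returncode, first_line),
            file=sys.stderr,
        )
        return None
```

Whether swallowing the error like this is desirable is a separate question: returning `None` hides the failure from callers, so the caller would still need to decide how to treat a clone that did not happen.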


Related issues 1 (1 open, 0 closed)

Copied from openQA Infrastructure (public) - action #174580: [FIRING:1] Failed systemd services alert (Blocked, nicksinger, 2024-11-19)

Actions #1

Updated by nicksinger 5 days ago

  • Copied from action #174580: [FIRING:1] Failed systemd services alert added
Actions #2

Updated by jbaier_cz 5 days ago

Hmm, is this a combination of investigation jobs cloning a failed test and the fail-early feature of CloneJobSUSE, which checks whether all IBS folders are available and fails the cloning process if they are not?
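For readers unfamiliar with the idea, a fail-early check of this kind can be sketched roughly as below. This is an illustration only, written in Python, and does not reproduce how `CloneJobSUSE.pm` is implemented: the function names, the use of an HTTP HEAD request, and the way the repository list is passed in are all assumptions.

```python
import urllib.error
import urllib.request


def url_is_available(url, timeout=10):
    """Return True if an HTTP HEAD request to url succeeds (assumed check)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


def unavailable_repos(repo_urls, is_available=url_is_available):
    """Return the subset of repo_urls that the checker reports as missing."""
    return [url for url in repo_urls if not is_available(url)]


def check_before_clone(job_id, repo_urls, is_available=url_is_available):
    """Raise before cloning if any repository is unavailable."""
    missing = unavailable_repos(repo_urls, is_available)
    if missing:
        raise RuntimeError(
            "Current job %s will fail, because the repositories for the "
            "below updates are unavailable: %s" % (job_id, ", ".join(missing))
        )
```

If the theory in this comment holds, an investigation job that clones a test whose maintenance repositories have since been removed would always hit the failing branch, which would match the repeated rc 255 entries in the journal.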
