action #155743

open

OBSRSync fails to sync openSUSE:Factory:PowerPC:ToTest (was: WARNINGs: failed is 452.00 in Munin - minion Minion Jobs on o3)

Added by livdywan 10 months ago. Updated 3 months ago.

Status:
Blocked
Priority:
Low
Assignee:
Category:
-
Target version:
QA (public, currently private due to #173521) - Tools - Next
Start date:
2024-02-21
Due date:
% Done:

0%

Estimated time:

Description

Observation

Several emails with the subject Munin - minion Minion Jobs and content like this:

opensuse.org :: openqa.opensuse.org :: Minion Jobs - see https://openqa.opensuse.org/minion/jobs?state=failed
    WARNINGs: failed is 452.00 (outside range [:400]).
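
For reference, the `[:400]` in the warning is a Munin `min:max` range with an empty lower bound, so only values above 400 trigger it. A minimal sketch of that check follows; the helper names `parse_range` and `is_outside` are made up for illustration and are not part of Munin:

```python
from typing import Optional, Tuple


def parse_range(spec: str) -> Tuple[Optional[float], Optional[float]]:
    """Parse a 'min:max' range as printed in the warning, e.g. '[:400]'.

    Either side may be empty, meaning "no limit on that side".
    """
    low, sep, high = spec.strip().strip("[]").partition(":")
    if not sep:
        raise ValueError(f"expected 'min:max', got {spec!r}")
    return (float(low) if low else None, float(high) if high else None)


def is_outside(value: float, spec: str) -> bool:
    """Return True if value falls outside the given range."""
    low, high = parse_range(spec)
    return (low is not None and value < low) or (high is not None and value > high)


# Value and range taken from the email quoted above
print(is_outside(452.00, "[:400]"))
```

With the numbers from the email, 452 failed jobs is above the upper bound of 400, hence the warning.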

Looking at https://openqa.opensuse.org/minion/jobs?state=failed reveals that a lot of obs_rsync_run jobs fail, with failed jobs as recent as 2024-02-11T10:08:17.307669Z:

---
args:
- project: openSUSE:Factory:PowerPC:ToTest
  url: https://api.opensuse.org/public/build/openSUSE:Factory:PowerPC:ToTest/_result?package=000product
attempts: 1
children: []
created: 2024-02-11T10:06:07.856414Z
delayed: 2024-02-11T10:06:07.856414Z
expires: ~
finished: 2024-02-11T10:08:17.307669Z
id: 3412364
lax: 0
notes:
  gru_id: 19905665
  project_lock: 1
parents: []
priority: 100
queue: default
result:
  code: 512
  message: |-
    openSUSE:Factory:PowerPC:ToTest/base/ exit code: 1 (1 failures total so far)
    openSUSE:Factory:PowerPC:ToTest/microos/ exit code: 1 (2 failures total so far)
retried: ~
retries: 0
started: 2024-02-11T10:06:07.858866Z
state: failed
task: obs_rsync_run
time: 2024-02-21T12:07:01.731854Z
worker: 1952

and

---
args:
- project: openSUSE:Factory:LegacyX86:ToTest
  url: https://api.opensuse.org/public/build/openSUSE:Factory:LegacyX86:ToTest/_result?package=000product
attempts: 1
children: []
created: 2024-02-09T13:33:44.131117Z
delayed: 2024-02-09T13:33:44.131117Z
expires: ~
finished: 2024-02-09T13:35:39.515968Z
id: 3407299
lax: 0
notes:
  gru_id: 19902081
  project_lock: 1
parents: []
priority: 100
queue: default
result: 'Job terminated unexpectedly (exit code: 0, signal: 15)'
retried: ~
retries: 0
started: 2024-02-09T13:33:44.133221Z
state: failed
task: obs_rsync_run
time: 2024-02-21T12:07:01.731854Z
worker: 1950

as well as

---
args:
- project: openSUSE:Leap:15.6:ToTest
  url: https://api.opensuse.org/public/build/openSUSE:Leap:15.6:ToTest/_result?package=000product
attempts: 1
children: []
created: 2024-02-09T01:21:47.455035Z
delayed: 2024-02-09T01:21:47.455035Z
expires: ~
finished: 2024-02-09T01:25:08.329909Z
id: 3404260
lax: 0
notes:
  gru_id: 19899816
  project_lock: 1
parents: []
priority: 100
queue: default
result:
  code: 256
  message: No message
retried: ~
retries: 0
started: 2024-02-09T01:21:47.456660Z
state: failed
task: obs_rsync_run
time: 2024-02-21T12:07:01.731854Z
worker: 1950
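
A side note on the `code` values above: 512 and 256 look like raw POSIX wait statuses rather than plain exit codes. This is an assumption, not something confirmed from the obs_rsync_run code. If it holds, they can be decoded as in this sketch (`decode_wait_status` is a hypothetical helper):

```python
def decode_wait_status(status: int) -> dict:
    """Split a 16-bit POSIX wait status into its parts.

    Assumption: the value is a raw wait status, where the high byte is the
    exit code and the low 7 bits are the terminating signal.
    """
    return {
        "exit_code": (status >> 8) & 0xFF,
        "signal": status & 0x7F,
        "core_dumped": bool(status & 0x80),
    }


# 512 and 256 are the `code` values from the jobs above; 15 is SIGTERM
for status in (512, 256, 15):
    print(status, decode_wait_status(status))
```

Under that assumption 512 would be exit code 2 and 256 exit code 1, while a bare 15 corresponds to the "exit code: 0, signal: 15" reported for the LegacyX86 job.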

Suggestions


Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure (public) - action #163340: OBSRSync regularily fails minion jobs - nobody cares, tools gets alerted (e.g. "Munin - minion Minion Jobs") size:M (Resolved, dheidler)

Actions #1

Updated by livdywan 10 months ago

  • Status changed from New to In Progress
  • Assignee set to livdywan

I'll take a look

Actions #2

Updated by livdywan 10 months ago

  • Description updated (diff)
  • Status changed from In Progress to Feedback

Maybe those issues have already been addressed. I cleaned up the dashboard and asked people in eng-testing to check those failures.

Actions #3

Updated by okurz 10 months ago

  • Due date set to 2024-03-06
Actions #4

Updated by livdywan 10 months ago

livdywan wrote in #note-2:

Maybe those issues have already been addressed. I cleaned up the dashboard and asked people in eng-testing to check those failures.

For the record, there are also these. I assume they don't need to be investigated as they are not recurring and there were only two:

---
args:
- max_screenshot_id: 1750000246
  min_screenshot_id: 1715000247
  screenshots_per_batch: 200000
attempts: 1
children: []
created: 2024-02-20T13:49:06.267911Z
delayed: 2024-02-21T07:50:45.165608Z
expires: 2024-02-22T13:49:06.267911Z
finished: 2024-02-21T11:50:12.654562Z
id: 3437362
lax: 0
notes:
  gru_id: 19925144
  signal_handler: Received signal TERM, scheduling retry and releasing locks
parents: []
priority: 0
queue: default
result: Worker went away
retried: 2024-02-21T07:50:45.165608Z
retries: 976
started: 2024-02-21T07:50:45.168416Z
state: failed
task: limit_screenshots
time: 2024-02-21T12:42:43.377631Z
worker: 1971
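
To compare how long these jobs ran before failing, the `started` and `finished` timestamps can be subtracted. A small sketch, with `parse_ts` and `runtime` as hypothetical helper names:

```python
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse a Minion timestamp such as 2024-02-21T07:50:45.168416Z."""
    # fromisoformat() in older Python versions does not accept a trailing 'Z'
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def runtime(started: str, finished: str) -> timedelta:
    """Return how long a job ran between its start and finish timestamps."""
    return parse_ts(finished) - parse_ts(started)


# limit_screenshots job 3437362 from this comment
print(runtime("2024-02-21T07:50:45.168416Z", "2024-02-21T11:50:12.654562Z"))
# obs_rsync_run job 3412364 from the description
print(runtime("2024-02-11T10:06:07.858866Z", "2024-02-11T10:08:17.307669Z"))
```
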
Actions #5

Updated by livdywan 10 months ago

livdywan wrote in #note-2:

Maybe those issues have already been addressed. I cleaned up the dashboard and asked people in eng-testing to check those failures.

Also mentioned it on irc://libera.chat/opensuse-factory

Actions #6

Updated by livdywan 10 months ago

openSUSE:Factory:PowerPC:ToTest also failed yesterday and today, apparently being "disabled" and "unpublished", see also https://build.opensuse.org/repositories/openSUSE:Factory:PowerPC:ToTest/

Symptomatically it reminds me of #112871, which boils down to images that just aren't available, although the error is not exactly the same.

Actions #7

Updated by livdywan 9 months ago

  • Subject changed from WARNINGs: failed is 452.00 in Munin - minion Minion Jobs on o3 to OBSRSync fails to sync openSUSE:Factory:PowerPC:ToTest (was: WARNINGs: failed is 452.00 in Munin - minion Minion Jobs on o3)
  • Status changed from Feedback to Blocked
Actions #8

Updated by okurz 9 months ago

  • Due date deleted (2024-03-06)
  • Priority changed from High to Low
  • Target version changed from Ready to Tools - Next

I consider it highly unlikely that we will receive a response soon, so I am removing the due date and such.

Actions #9

Updated by nicksinger 5 months ago

  • Copied to action #163340: OBSRSync regularily fails minion jobs - nobody cares, tools gets alerted (e.g. "Munin - minion Minion Jobs") size:M added
Actions #10

Updated by nicksinger 5 months ago

  • Copied to deleted (action #163340: OBSRSync regularily fails minion jobs - nobody cares, tools gets alerted (e.g. "Munin - minion Minion Jobs") size:M)
Actions #11

Updated by nicksinger 5 months ago

  • Related to action #163340: OBSRSync regularily fails minion jobs - nobody cares, tools gets alerted (e.g. "Munin - minion Minion Jobs") size:M added
Actions #12

Updated by livdywan 5 months ago

livdywan wrote in #note-7:

https://bugzilla.opensuse.org/show_bug.cgi?id=1220707

The ticket has had an assignee since 2024-03-07 08:03:03 UTC but no concrete response yet.

Actions #13

Updated by livdywan 3 months ago

https://bugzilla.opensuse.org/show_bug.cgi?id=1220707

The ticket has had an assignee since 2024-03-07 08:03:03 UTC but no concrete response yet.

No response so far. Added another comment and mentioned the latest failure from today 🤔
