action #119161

Approval step of qem-bot says incident has failed job in incidents but it looks empty on the dashboard size:M

Added by mkittler 4 months ago. Updated 2 months ago.

Status: Workable
Priority: Low
Assignee: -
Target version:
Start date: 2022-10-21
Due date:
% Done: 0%
Estimated time:

Description

Observation

INFO: Job 1935219 not found 
INFO: Job 1935211 not found 
INFO: Inc 25951 has failed job in incidents
INFO: Inc 25991 does not have any job_settings
INFO: Inc 26161 does not have any aggregates settings

(from https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1202686)

On the dashboard I couldn't see any jobs for that incident (https://dashboard.qam.suse.de/incident/25951), and querying the openQA database didn't return any jobs either:

openqa=# select id, BUILD from jobs where BUILD like '%25951%';
 id | build 
----+-------
(0 rows)

This looks rather weird. It would be good to have an explanation of what's going on and to find a way to make it more obvious.

Acceptance criteria

  • AC1: "has failed job" messages are more specific

Suggestion

  • Distinguish between non-existing and (potentially) ignored jobs. The dashboard does not know whether there are only ignored jobs (jobs for products in development groups are not submitted to the dashboard by the bot). If possible, consider transmitting the number of all/failed/ignored jobs and remembering it so that we can distinguish these cases
  • Alternative: We could just point to the corresponding GitLab CI step that potentially mentions ignored jobs
  • Make the message as specific as possible, depending on how much the bot actually knows in the accept step
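
A minimal sketch of what more specific messages could look like, assuming the bot had per-incident counts of all/failed/ignored jobs available in the accept step. The `JobCounts` class and `describe_blocker` function are hypothetical and not part of qem-bot or the dashboard API; `failed` is assumed to count only jobs that are not ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobCounts:
    """Hypothetical per-incident summary (not an existing qem-bot type)."""
    total: int    # all jobs known for the incident
    failed: int   # failed jobs that are NOT ignored (assumption)
    ignored: int  # jobs the bot decided to ignore


def describe_blocker(incident: int, counts: JobCounts) -> Optional[str]:
    """Return a specific message, or None if nothing blocks approval."""
    if counts.total == 0:
        # The case from this ticket: nothing on the dashboard at all.
        return f"Inc {incident} has no jobs in incidents"
    if counts.failed > 0:
        return (
            f"Inc {incident} has {counts.failed} failed job(s) "
            f"of {counts.total} in incidents"
        )
    if counts.ignored == counts.total:
        return f"Inc {incident} has only ignored jobs in incidents ({counts.ignored})"
    return None


if __name__ == "__main__":
    print(describe_blocker(25951, JobCounts(total=0, failed=0, ignored=0)))
```

With counts like these, the "no jobs" case would be reported separately from real failures and would no longer be logged as "has failed job".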

Related issues

Related to QA - action #103701: Resubmited incident (ID) with new release request (RR) inherits incident test results from previous RR (Resolved, 2021-12-08)

Related to QA - action #107923: qem-bot: Ignore not-ok openQA jobs for specific incident based on openQA job comment size:M (Workable)

History

#1 Updated by jbaier_cz 4 months ago

AFAIK the wording is just bad; the condition in https://github.com/openSUSE/qem-bot/blob/master/openqabot/approver.py#L67 just means that no job was successful (because there are none at all). We can maybe distinguish between "there are failed jobs" and "there are no jobs" in https://github.com/openSUSE/qem-bot/blob/master/openqabot/approver.py#L124
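
To illustrate the point (this is not the actual code in approver.py; both function names are hypothetical): a check of the form "no job succeeded" is also true for an empty job list, so an explicit emptiness test is needed to tell the two cases apart.

```python
from typing import List


def no_job_succeeded(results: List[bool]) -> bool:
    # Hypothetical check: True when no job passed.
    # any([]) is False, so this is also True when there are no jobs at all.
    return not any(results)


def classify(results: List[bool]) -> str:
    # Test for "no jobs" first, then for failures.
    if not results:
        return "no jobs"
    if not all(results):
        return "failed jobs"
    return "ok"


if __name__ == "__main__":
    print(no_job_succeeded([]))  # True, although nothing actually failed
    print(classify([]))          # no jobs
```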

#2 Updated by okurz 4 months ago

  • Priority changed from Normal to Low
  • Target version set to Ready

#3 Updated by mkittler 3 months ago

  • Subject changed from Approval step of qem-bot says incident has failed job in incidents but it looks empty on the dashboard to Approval step of qem-bot says incident has failed job in incidents but it looks empty on the dashboard size:M
  • Description updated (diff)
  • Status changed from New to Workable

#4 Updated by kraih 3 months ago

  • Assignee set to kraih

Maybe I'll practice some Python.

#5 Updated by okurz 3 months ago

  • Related to action #103701: Resubmited incident (ID) with new release request (RR) inherits incident test results from previous RR added

#6 Updated by okurz 3 months ago

Please see #107923#note-37 about what looks like a related issue. Or is it #103701 all over again?

#7 Updated by kraih 3 months ago

  • Assignee deleted (kraih)

Putting the ticket back into the queue for now. Will pick it up again later if nobody else wants to work on it.

#8 Updated by jbaier_cz 2 months ago

  • Related to action #107923: qem-bot: Ignore not-ok openQA jobs for specific incident based on openQA job comment size:M added

#9 Updated by jbaier_cz 2 months ago

Some of that was also targeted by https://github.com/openSUSE/qem-bot/pull/84, so this might actually be solved already.

#10 Updated by okurz 2 months ago

  • Target version changed from Ready to future
