action #122308: Handle invalid openQA job references in qem-dashboard size:M - QA (public) - openSUSE Project Management Tool


closed

coordination #99303: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, approval and release


Added by okurz over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee: jbaier_cz
Target version: openQA Project (public) - Ready
Start date: 2022-12-21
Due date:
% Done:
Estimated time:

Description

Motivation

See #97118#note-10. Looking into https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1301182 for the most recent run of "approve" we found more problems:

2022-12-21 13:34:16 INFO     Job 1967173 not found 
2022-12-21 13:34:16 INFO     Job 1967169 not found 
2022-12-21 13:34:16 INFO     Found failed, not-ignored job 57268 for incident 27251
2022-12-21 13:34:16 INFO     Inc 27251 has at least one failed job in aggregate tests
2022-12-21 13:34:16 INFO     Found failed, not-ignored job 1967179 for incident 27252

So there appear to be "jobs" 57268 and 1967179 which are not valid openQA jobs on openqa.suse.de, yet these "jobs" block the approval. What are they? Regardless, they should be handled properly. If they are openQA job references stored in the database, we should likely cross-check all openQA job IDs: whenever one would block approval, check whether it actually exists in the live openQA database and delete it (or at least ignore it) if it does not. It looks like this kind of ID is either an incident_openqa_settings ID or an update_openqa_settings ID, but not an openQA job ID. That makes me doubt my understanding of the code base. In particular, it would mean that the comment-lookup feature I once introduced cannot actually work because it is not using an openQA job ID (if that is correct, the is_job_marked_acceptable_for_incident function is basically broken). The log message should also be improved to state what kind of ID is logged, because "job" is highly ambiguous. The code should also have a comment where JobAggr is defined, explaining what job_id is.
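The cross-check described above could look roughly like the following. This is only a sketch under assumptions: `blocking_openqa_jobs` and the `job_exists` callback are hypothetical names, not part of qem-bot, and the IDs in the usage example are arbitrary. The callback stands in for whatever lookup against the live openQA instance would be used in practice.

```python
import logging
from typing import Callable, Iterable, List

log = logging.getLogger("approver-sketch")


def blocking_openqa_jobs(
    failed_job_ids: Iterable[int],
    job_exists: Callable[[int], bool],
) -> List[int]:
    """Return only those failed job IDs that openQA still knows about.

    `job_exists` is a hypothetical callback; a real implementation would
    ask the openQA instance whether the job ID is valid.
    """
    blocking = []
    for job_id in failed_job_ids:
        if job_exists(job_id):
            blocking.append(job_id)
        else:
            # Say explicitly which kind of ID is being logged.
            log.info("Ignoring openQA job ID %s: not found in openQA", job_id)
    return blocking


# Usage with an in-memory stand-in for the live openQA database
known_jobs = {2}
still_blocking = blocking_openqa_jobs([1, 2, 3], known_jobs.__contains__)
```

Passing the lookup in as a callable keeps the filter testable without network access; only references that resolve to an existing job would then be able to block approval.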

Acceptance criteria

AC1: The message "Found failed, not-ignored job …" refers to actual openQA jobs

Suggestions

See how the message is currently written in https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/openqabot/approver.py#L148
Also see https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql#L53
Write a (failing) unit test that refers to actual openQA jobs from both incident and aggregate tests (both, because the code might already work for incident tests while for aggregate tests we might be referring to the wrong ID so far)
We assume that in https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/tests/test_approve.py#L434 we could check for the log message for a failed job, but the number is likely not 20005 from https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/tests/test_approve.py#L427 but another number referring to a "real openQA job". In the case of https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/tests/test_approve.py#L160 the message might already be correct, which we could confirm by adding a test asserting that "job ID 100" is expected to show up. So the suggestion is:
- Extend https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/tests/test_approve.py#L181 to check for a message like Found failed, not-ignored job 100
- Extend https://github.com/openSUSE/qem-bot/blob/2aac660ef36c9584ce56ab4e08c4705371d4dc02/tests/test_approve.py#L434 to check for a message like Found failed, not-ignored job ? with ? to be filled in with a proper number, but likely not 20005
Change the code so that the necessary information is available, e.g. in the JobAggr class
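A minimal sketch of what such a test could check, assuming the job reference carries both kinds of ID. `JobRef` and `report_failed` are hypothetical stand-ins (the real JobAggr may look different), and the numbers are arbitrary. It uses only the standard library's `unittest.assertLogs` rather than the fixtures in test_approve.py.

```python
import logging
import unittest
from typing import NamedTuple

log = logging.getLogger("approver-sketch")


class JobRef(NamedTuple):
    """Hypothetical stand-in for JobAggr that names each ID explicitly."""

    settings_id: int  # dashboard-internal settings ID (incident or update)
    openqa_job_id: int  # job ID as known to the openQA instance
    aggregate: bool  # True for aggregate tests, False for incident tests


def report_failed(ref: JobRef, incident: int) -> None:
    # Log the openQA job ID, not the dashboard-internal settings ID.
    log.info(
        "Found failed, not-ignored job %s for incident %s",
        ref.openqa_job_id,
        incident,
    )


class TestLogMessage(unittest.TestCase):
    def test_message_uses_openqa_job_id(self):
        ref = JobRef(settings_id=7, openqa_job_id=100, aggregate=True)
        with self.assertLogs("approver-sketch", level="INFO") as captured:
            report_failed(ref, incident=1)
        self.assertIn("not-ignored job 100 for incident 1", captured.output[0])
```

The test can be run with `python -m unittest`. The point of the assertion is that the number in the message is the openQA job ID, so a reader of the log can look the job up on the openQA instance directly.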

Related issues 2 (1 open — 1 closed)

