action #109974

open

QA (public) - coordination #99303: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, approval and release

QA (public) - coordination #109644: [epic] Future improvements for qem-bot

qem-bot/dashboard - mixed old and new incidents - potential future ideas

Added by okurz over 2 years ago. Updated 7 months ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Observation

Maintenance sometimes re-uses old incidents instead of creating new ones for a package, which leads to mixed results in the dashboard :(

see: https://suse.slack.com/archives/C02D16TCP99/p1648721562205869

So we need a workaround or solution for this corner case.

See also https://github.com/openSUSE/qem-dashboard/issues/61

Originally brought up by coolo in
https://suse.slack.com/archives/C02D16TCP99/p1638283633141300

I just noticed a rather alarming issue: http://dashboard.qam.suse.de/incident/20989 talks about 43 passed, 1 failed jobs for the incident

Problems

Acceptance criteria

  • AC1: It is possible to reuse incidents and qem-bot can still approve related release requests

Suggestions

  • Read the qem-dashboard schema in https://github.com/openSUSE/qem-dashboard/ to understand where important settings are stored, in particular https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql
  • Try to document a proper manual process as a "Workaround", also so that we understand the problem ourselves
  • As a first feature, just delete all aggregate openQA data in qem-dashboard older than a configurable age, defaulting to 90 days
  • Optional: Add a GitLab CI pipeline that can be triggered manually
  • The dashboard can trigger that cleanup when it gets new smelt data and notices an update of the RR (release request)
  • We might need to identify "outdated openQA jobs" by "low openQA job id" or a timestamp. Might be necessary to add that to the qem-dashboard database
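The age-based cleanup and the "outdated openQA jobs" idea above could be sketched as a small filter. This is only an illustration in Python, not qem-dashboard code: the `Job` record, its `updated` field and the function name `outdated_jobs` are assumptions made for the example, and the real schema may store this information differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Job:
    """Hypothetical, simplified view of an aggregate openQA job record."""
    job_id: int
    updated: datetime


def outdated_jobs(jobs, now, max_age_days=90):
    """Return the jobs whose timestamp is more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [job for job in jobs if job.updated < cutoff]


# Example data: one job well past the default 90 days, one recent job.
now = datetime(2022, 6, 1, tzinfo=timezone.utc)
jobs = [
    Job(job_id=100, updated=now - timedelta(days=120)),
    Job(job_id=200, updated=now - timedelta(days=10)),
]
stale_ids = [job.job_id for job in outdated_jobs(jobs, now)]
```

Filtering by timestamp rather than by "low openQA job id" avoids having to pick a job id threshold, but it requires that a timestamp is actually stored for each job.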

Workarounds

  • Ask maintenance to create a new, fresh incident, e.g. by a comment in IBS
  • Detect invalid requests, e.g. ones with outdated results, and reject them
  • Manually delete the outdated openQA job data from the dashboard database

Something along the lines of

ssh root@qam2.suse.de
machinectl shell postgresql
sudo -u postgres psql dashboard_db
(wreak havoc in here)

SELECT update_settings FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL X;
(store update_settings)

DELETE FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL X;
DELETE FROM update_openqa_settings WHERE id IN (stored update_settings);
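The select, store and delete steps above could be wrapped in a single transaction so that a failure halfway through leaves nothing half-deleted. The following is a rough sketch only: it uses an in-memory SQLite database as a stand-in for the PostgreSQL dashboard database, the two-table schema is heavily simplified, and the `timestamp` column name as well as the function name `delete_old_aggregate_jobs` are assumptions taken from the statements above, not verified against dashboard.sql.

```python
import sqlite3


def delete_old_aggregate_jobs(conn, cutoff):
    """Delete aggregate jobs older than `cutoff` and their settings rows.

    Returns the sorted ids of the settings rows referenced by the deleted jobs.
    """
    old = "update_settings IS NOT NULL AND timestamp < ?"
    with conn:  # one transaction: commit on success, roll back on error
        rows = conn.execute(
            f"SELECT DISTINCT update_settings FROM openqa_jobs WHERE {old}",
            (cutoff,),
        ).fetchall()
        settings_ids = sorted(row[0] for row in rows)
        conn.execute(f"DELETE FROM openqa_jobs WHERE {old}", (cutoff,))
        conn.executemany(
            "DELETE FROM update_openqa_settings WHERE id = ?",
            [(settings_id,) for settings_id in settings_ids],
        )
    return settings_ids


# Stand-in schema for the demo; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE update_openqa_settings (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE openqa_jobs (job_id INTEGER, update_settings INTEGER, timestamp TEXT)"
)
conn.executemany("INSERT INTO update_openqa_settings (id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO openqa_jobs VALUES (?, ?, ?)",
    [(10, 1, "2021-11-30"), (11, 2, "2022-03-31"), (12, None, "2021-01-01")],
)
deleted = delete_old_aggregate_jobs(conn, "2022-01-01")
remaining = [
    row[0] for row in conn.execute("SELECT job_id FROM openqa_jobs ORDER BY job_id")
]
```

Like the statements above, this deletes a settings row even if newer jobs still reference it, so a real cleanup would need to check for that case first.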

Related issues 2 (1 open, 1 closed)

Related to QA (public) - action #155206: [qem-bot] re-release update can miss repo and thus not schedule updates (New, 2024-02-08)
Copied from openQA Project (public) - action #109310: qem-bot/dashboard - mixed old and new incidents size:M (Resolved, kraih, 2022-03-31)

#1 Updated by okurz over 2 years ago

  • Copied from action #109310: qem-bot/dashboard - mixed old and new incidents size:M added

#2 Updated by okurz over 2 years ago

  • Parent task changed from #109641 to #110836

#3 Updated by jbaier_cz 10 months ago

  • Related to action #155206: [qem-bot] re-release update can miss repo and thus not schedule updates added

#4 Updated by okurz 7 months ago

  • Parent task changed from #110836 to #109644