action #109974

QA - coordination #99303: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, approval and release

QA - coordination #110836: [epic] future qem-bot improvements

qem-bot/dashboard - mixed old and new incidents - potential future ideas

Added by okurz 3 months ago. Updated about 2 months ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

Maintenance sometimes re-uses an old incident for a package instead of creating a new one, which leads to mixed old and new results in the dashboard :(

see: https://suse.slack.com/archives/C02D16TCP99/p1648721562205869

So we need a workaround or solution for this corner case.

See also https://github.com/openSUSE/qem-dashboard/issues/61

Originally brought up by coolo in
https://suse.slack.com/archives/C02D16TCP99/p1638283633141300

I just noticed a rather alarming issue: http://dashboard.qam.suse.de/incident/20989 talks about 43 passed, 1 failed jobs for the incident

Problems

Acceptance criteria

  • AC1: It is possible to reuse incidents and qem-bot can still approve related release requests

Suggestions

  • Read the qem-dashboard schema to understand where important settings are stored in https://github.com/openSUSE/qem-dashboard/ , in particular https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql
  • Document a proper manual process as a "Workaround", also so that we understand the procedure ourselves
  • As a first feature, just delete all aggregate openQA data in qem-dashboard older than a configurable age (default: 90 days)
  • Optional: Add a GitLab CI pipeline job that can be triggered manually
  • The dashboard can trigger that cleanup when it gets new smelt data and notices an update of the RR (release request)
  • We might need to identify "outdated openQA jobs" by a "low openQA job id" or by a timestamp. It might be necessary to add that to the qem-dashboard database
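
The age-based selection suggested above could look roughly like the following. This is a minimal Python sketch, not code from qem-bot or qem-dashboard: the `Job` record, its `finished_at` field and the `split_outdated` helper are hypothetical names, and only the 90-day default is taken from the suggestion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for one aggregate openQA job row."""
    job_id: int
    finished_at: datetime  # assumed field; may first need adding to the schema


def split_outdated(jobs, now, max_age_days=90):
    """Return (current, outdated) job lists, split by a configurable age."""
    cutoff = now - timedelta(days=max_age_days)
    current = [j for j in jobs if j.finished_at >= cutoff]
    outdated = [j for j in jobs if j.finished_at < cutoff]
    return current, outdated


# Example data: one job from a much older use of the incident, one recent job.
now = datetime(2022, 6, 1, tzinfo=timezone.utc)
jobs = [
    Job(job_id=100, finished_at=now - timedelta(days=200)),
    Job(job_id=900, finished_at=now - timedelta(days=3)),
]
current, outdated = split_outdated(jobs, now)
```

Only the jobs in `current` would then count towards the approval decision, while the `outdated` ones are candidates for deletion.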

Workarounds

  • Ask maintenance to create a new, fresh incident, e.g. by a comment in IBS
  • Detect invalid requests, e.g. with outdated results, and reject them
  • Manually delete the outdated openQA job data from the dashboard database

Something along the lines of

ssh root@qam2.suse.de
machinectl shell postgresql
sudo -u postgres psql dashboard_db
(wreak havoc in here)

SELECT update_settings FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL X;
(store update_settings)

DELETE FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL X;
DELETE FROM update_openqa_settings WHERE id IN (stored update_settings);
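
The three manual steps above (collect the settings ids, delete the jobs, delete the settings) could be wrapped in one transaction so that a failure leaves nothing half-deleted. The following is only a sketch under assumptions: it uses an in-memory SQLite database as a stand-in for the real PostgreSQL `dashboard_db`, the reduced table layout and the `timestamp` column are assumed rather than taken from the real schema, and `delete_outdated` is a hypothetical helper.

```python
import sqlite3


def delete_outdated(conn, cutoff_iso):
    """Delete aggregate jobs older than the cutoff plus their settings rows."""
    with conn:  # one transaction: commit on success, roll back on error
        ids = [row[0] for row in conn.execute(
            "SELECT DISTINCT update_settings FROM openqa_jobs "
            "WHERE update_settings IS NOT NULL AND timestamp < ?",
            (cutoff_iso,))]
        conn.execute(
            "DELETE FROM openqa_jobs "
            "WHERE update_settings IS NOT NULL AND timestamp < ?",
            (cutoff_iso,))
        conn.executemany(
            "DELETE FROM update_openqa_settings WHERE id = ?",
            [(i,) for i in ids])
    return ids


# Stand-in schema and data; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE update_openqa_settings (id INTEGER PRIMARY KEY);
    CREATE TABLE openqa_jobs (
        id INTEGER PRIMARY KEY,
        update_settings INTEGER,
        timestamp TEXT
    );
    INSERT INTO update_openqa_settings (id) VALUES (1), (2);
    INSERT INTO openqa_jobs (update_settings, timestamp) VALUES
        (1, '2021-11-30T00:00:00'),
        (2, '2022-05-30T00:00:00'),
        (NULL, '2021-01-01T00:00:00');
""")
removed = delete_outdated(conn, "2022-03-03T00:00:00")
```

Like the manual procedure, this sketch does not check whether a settings row is still referenced by newer jobs; that case would need extra care before running anything similar against production data.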

Related issues

Copied from openQA Project - action #109310: qem-bot/dashboard - mixed old and new incidents size:M (Resolved, 2022-03-31)

History

#1 Updated by okurz 3 months ago

  • Copied from action #109310: qem-bot/dashboard - mixed old and new incidents size:M added

#2 Updated by okurz about 2 months ago

  • Parent task changed from #109641 to #110836
