action #109310

Updated by okurz over 2 years ago

## Observation 

 Maintenance sometimes re-uses old incidents instead of creating new ones for a package, which leads to mixed results in the dashboard :( 

 see: https://suse.slack.com/archives/C02D16TCP99/p1648721562205869 

 So we need a workaround or solution for this corner case. 

 See also https://github.com/openSUSE/qem-dashboard/issues/61 

 Originally brought up by coolo in 
 https://suse.slack.com/archives/C02D16TCP99/p1638283633141300  

 > I just noticed a rather alarming issue: http://dashboard.qam.suse.de/incident/20989 talks about 43 passed, 1 failed jobs for the incident 


 ## Problems 
 * http://dashboard.qam.suse.de/incident/20639 references "208 passed, 4 failed, 12 stopped" and a link to openQA results https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc but the openQA test results only show 183 passed and 18 soft-failed 
   * -> dashboard should say "ok" rather than "passed" when it means "passed+softfailed", see https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Jobs/Constants.pm#L76= 
   * -> Consider using time-fixed links, e.g. https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc&t=2022-04-01+08%3A53%3A19+%2B0000 
   * -> Ensure that the results are current and correspond to what openQA sees itself (numbers should match) 
   * -> Exclude any results that are outside a "reasonable time range", e.g. http://dashboard.qam.suse.de/blocked for 20639 shows incident results from some months ago, build 2021… 
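
The first two points above can be sketched in a few lines of Python. This is only an illustration, not code from qem-dashboard or qem-bot: the function names `overview_url` and `summarize` are hypothetical, and the "ok" grouping simply follows the passed+softfailed note above.

```python
from collections import Counter
from urllib.parse import urlencode

# Results counted as "ok", following the passed+softfailed note above.
OK_RESULTS = {"passed", "softfailed"}


def overview_url(base, build, timestamp=None):
    """Build an openQA overview link, optionally pinned to a point in time."""
    params = {"build": build}
    if timestamp is not None:
        # The "t" parameter makes the link time-fixed.
        params["t"] = timestamp
    return f"{base}/tests/overview?{urlencode(params)}"


def summarize(results):
    """Group raw job results into "ok", "failed" and "other" buckets."""
    counts = Counter()
    for result in results:
        if result in OK_RESULTS:
            counts["ok"] += 1
        elif result == "failed":
            counts["failed"] += 1
        else:
            counts["other"] += 1
    return dict(counts)
```

Calling `overview_url("https://openqa.suse.de", ":20639:opensc", "2022-04-01 08:53:19 +0000")` reproduces the time-fixed link quoted above.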

 ## Acceptance criteria 
 * **AC1:** It is possible to reuse incidents and qem-bot can still approve related release requests 

 ## Suggestions 
 * Read the qem-dashboard schema to understand where important settings are stored in https://github.com/openSUSE/qem-dashboard/ , in particular https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql 
 * Read the manual process described under "Workarounds" (further down) so that we understand it 
 * Just delete all aggregate openQA data in qem-dashboard older than a configurable age, defaulting to 90 days 
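
A minimal sketch of the retention idea from the last suggestion, assuming a much simplified schema. The table names `openqa_jobs` and `update_openqa_settings` come from this ticket, but the `updated` column, the function name `purge_old_aggregate_jobs` and the use of `sqlite3` (instead of the real PostgreSQL database) are assumptions made only so the example is self-contained.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

DEFAULT_MAX_AGE_DAYS = 90  # default retention from the suggestion above


def purge_old_aggregate_jobs(conn, now, max_age_days=DEFAULT_MAX_AGE_DAYS):
    """Delete aggregate jobs older than the cutoff; return how many were deleted."""
    # Timestamps are assumed to be stored as ISO 8601 strings in one timezone,
    # so plain string comparison orders them correctly.
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    cur = conn.cursor()
    # Remember which settings rows the stale aggregate jobs point to.
    cur.execute(
        "SELECT DISTINCT update_settings FROM openqa_jobs "
        "WHERE update_settings IS NOT NULL AND updated < ?",
        (cutoff,),
    )
    stale_settings = [row[0] for row in cur.fetchall()]
    cur.execute(
        "DELETE FROM openqa_jobs "
        "WHERE update_settings IS NOT NULL AND updated < ?",
        (cutoff,),
    )
    deleted_jobs = cur.rowcount
    # Drop a settings row only if no remaining job still refers to it.
    cur.executemany(
        "DELETE FROM update_openqa_settings WHERE id = ? AND NOT EXISTS "
        "(SELECT 1 FROM openqa_jobs WHERE update_settings = ?)",
        [(settings_id, settings_id) for settings_id in stale_settings],
    )
    conn.commit()
    return deleted_jobs
```

Passing `now` explicitly keeps the cutoff reproducible, and `max_age_days` is the configurable part. Incident jobs (rows without `update_settings`) are left untouched.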

 ## Workarounds 
 * Ask maintenance to create a new, fresh incident, e.g. by a comment in IBS 
 * Detect invalid requests, e.g. those with outdated results, and reject them 
 * Manually delete the outdated results from the dashboard database 

 Something along the lines of 

 ```
 ssh root@qam2.suse.de
 machinectl shell postgresql
 sudo -u postgres psql dashboard_db
 -- (wreak havoc in here)

 -- X is the retention period, e.g. '90 days'; "timestamp" is a placeholder for the job's timestamp column
 SELECT update_settings FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL 'X';
 -- (store the returned update_settings ids)

 DELETE FROM openqa_jobs WHERE update_settings IS NOT NULL AND timestamp < NOW() - INTERVAL 'X';
 DELETE FROM update_openqa_settings WHERE id IN (<stored update_settings>);
 ```
