action #80806

openQA Project - coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #77899: [epic] Extend "auto-review" for failed jobs as well

Extend "auto-review" for failed jobs as well - Generalize openqa-monitor-investigation-candidates to look at more than just one job group

Added by okurz 6 months ago. Updated 6 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2020-12-07
Due date:
% Done: 0%
Estimated time:
Description

Motivation

With openqa-monitor-investigation-candidates in place and combinable with the auto-review workflow we can extend the approach to all "non-development" job groups on o3.

Acceptance criteria

  • AC1: openqa-monitor-investigation-candidates yields "non-development" jobs on o3, e.g. all in non-development job groups or parent job groups, potentially excluding more

Related issues

Copied to QA - action #80808: Extend "auto-review" for failed jobs as well - enable same as on o3 but on osd (Resolved, 2020-12-07)

History

#1 Updated by okurz 6 months ago

  • Copied to action #80808: Extend "auto-review" for failed jobs as well - enable same as on o3 but on osd added

#2 Updated by okurz 6 months ago

  • Status changed from Workable to Feedback
  • Assignee set to okurz

#3 Updated by okurz 6 months ago

  • Due date set to 2020-12-23

self-merged after 9 days now.

  • TODO: Monitor impact on auto-review gitlab CI pipeline

#4 Updated by okurz 6 months ago

  • Status changed from Feedback to Resolved

The update works as expected. https://openqa.opensuse.org/tests/1511805#comments is an interesting case showing

 auto-review wrote 2020-12-17 09:19:32 +0000

Automatic investigation jobs:

    upgrade_Leap_42.3_kde:investigate:retry: https://openqa.opensuse.org/t1511903
    upgrade_Leap_42.3_kde:investigate:last_good_tests:044414de6115c3a1a4f6cb91de0117235b7f144d: https://openqa.opensuse.org/t1511904
    upgrade_Leap_42.3_kde:investigate:last_good_build:20201214: https://openqa.opensuse.org/t1511905
    upgrade_Leap_42.3_kde:investigate:last_good_tests_and_build:044414de6115c3a1a4f6cb91de0117235b7f144d+20201214: https://openqa.opensuse.org/t1511906

geekotest wrote 2020-12-17 09:19:25 +0000 
…

so we have both the GitLab CI pipeline and the job-done-hooks triggering investigation jobs. The plan can of course be that the job-done-hooks replace the GitLab CI pipeline. The funny thing is that the comment listed first has the later timestamp. We could try to avoid the race by posting a comment that effectively "locks" the job. However, most jobs in https://gitlab.suse.de/openqa/auto-review/-/jobs look like they no longer need to trigger any investigation jobs, likely because the job-done-hooks work just fine.
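The "lock" comment idea could be sketched roughly as follows. This is only an illustration under assumptions: `try_claim`, `post_comment` and the `LOCK_MARKER` text are hypothetical names, not part of openQA or the auto-review scripts, and the comments are modelled as a plain in-memory list of strings. Because checking and then posting is itself not atomic, such a scheme would only narrow the race window, not close it.

```python
MARKER = "Automatic investigation jobs"    # text seen in the comments above
LOCK_MARKER = "Investigation in progress"  # hypothetical lock text


def try_claim(comments, post_comment):
    """Post a lock comment unless the job is already claimed or investigated.

    comments: list of comment texts already present on the job
    post_comment: callable posting a new comment text (hypothetical)
    Returns True if this caller may go on to trigger investigation jobs.
    """
    if any(MARKER in text or LOCK_MARKER in text for text in comments):
        return False
    post_comment(LOCK_MARKER)
    return True


# Usage with an in-memory stand-in for the job's comment list
comments = []
first = try_claim(comments, comments.append)
second = try_claim(comments, comments.append)
print(first, second)  # prints: True False
```

Whichever of the two triggers runs second would then see the lock comment and skip creating its own investigation jobs.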

With an SQL query we can check which one creates which jobs, e.g. on o3

    select jobs.id, result_dir, t_finished
      from comments, jobs, users
      where comments.user_id = users.id and comments.job_id = jobs.id
        and nickname ~ 'geekotest' and text ~ 'Automatic investigation jobs'
      order by id DESC limit 10;

yields

   id    |                                          result_dir                                           |     t_finished
---------+-----------------------------------------------------------------------------------------------+---------------------
 1512591 | 01512591-opensuse-15.1-DVD-Incidents-x86_64-Build:15296:gcc7.1608235055-gnome@64bit-2G        | 2020-12-17 21:08:32
 1512590 | 01512590-opensuse-15.1-DVD-Incidents-x86_64-Build:15296:gcc7.1608235055-cryptlvm@uefi-2G      | 2020-12-17 20:45:32
 1512509 | 01512509-opensuse-15.1-DVD-Incidents-x86_64-Build:15296:gcc7.1608228253-kde@64bit-2G          | 2020-12-17 20:18:17
 1512502 | 01512502-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-cryptlvm@64bit-2G                   | 2020-12-17 19:57:37
 1512501 | 01512501-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-textmode@64bit                      | 2020-12-17 20:04:33
 1512500 | 01512500-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-gnome@64bit-2G                      | 2020-12-17 20:06:39
 1512499 | 01512499-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-install_with_updates_kde@uefi-2G    | 2020-12-17 19:48:11
 1512498 | 01512498-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-install_with_updates_gnome@64bit-2G | 2020-12-17 19:44:38
 1512497 | 01512497-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-gnome@uefi                          | 2020-12-17 20:35:35
 1512496 | 01512496-opensuse-15.1-DVD-Updates-x86_64-Build20201217-5-kde@64bit-2G                        | 2020-12-17 20:55:01
(10 rows)

where, for example, https://openqa.opensuse.org/tests/1512591#comments shows that "next & previous" for maintenance jobs is not perfect, as a test job for a different package submission is identified as "last good".

The same query for comments from a user without a nickname,

    select jobs.id, result_dir, t_finished
      from comments, jobs, users
      where comments.user_id = users.id and comments.job_id = jobs.id
        and nickname is null and text ~ 'Automatic investigation jobs'
      order by id DESC limit 10;

shows

   id    |                                      result_dir                                      |     t_finished
---------+--------------------------------------------------------------------------------------+---------------------
 1511805 | 01511805-opensuse-Tumbleweed-DVD-x86_64-Build20201216-upgrade_Leap_42.3_kde@64bit    | 2020-12-17 09:19:06
 1509483 | 01509483-openqa-Tumbleweed-dev-x86_64-Build:TW.6771-openqa_from_git@64bit-2G         | 2020-12-15 15:42:38
 1509480 | 01509480-opensuse-15.2-DVD-Updates-x86_64-Build20201215-4-gnome@64bit-2G             | 2020-12-15 16:04:13
 1509473 | 01509473-opensuse-15.3-Rescue-CD-x86_64-Build3.38-rescue@64bit-2G                    | 2020-12-15 16:54:13
 1509470 | 01509470-opensuse-15.3-Rescue-CD-aarch64-Build3.38-rescue@USBboot_aarch64            | 2020-12-15 19:04:48
 1509469 | 01509469-opensuse-15.3-KDE-Live-x86_64-Build3.38-kde-live@64bit-2G                   | 2020-12-15 16:54:57
 1509468 | 01509468-opensuse-15.3-KDE-Live-x86_64-Build3.38-kde-live_installation@64bit-2G      | 2020-12-15 16:09:02
 1509467 | 01509467-opensuse-15.3-KDE-Live-x86_64-Build3.38-kde_live_upgrade_leap_42.3@64bit-2G | 2020-12-15 15:45:26
 1509466 | 01509466-opensuse-15.3-KDE-Live-x86_64-Build3.38-kde-live-wayland@64bit_virtio-2G    | 2020-12-15 16:39:36
 1509465 | 01509465-opensuse-15.3-KDE-Live-x86_64-Build3.38-kde_live_upgrade_leap_15.0@64bit    | 2020-12-15 15:44:05
(10 rows)

So the first row is the race condition mentioned above and all following rows are older jobs. With this I consider this ticket successfully resolved and I can continue in the parent epic.
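To spot such double-triggered jobs without comparing two query results by eye, the rows could be post-processed along these lines. This is a minimal sketch that assumes the comment rows have already been fetched as `(job_id, nickname, text)` tuples; the function name and the sample rows are illustrative only, with the job IDs taken from the tables above and the comment texts shortened.

```python
from collections import defaultdict

MARKER = "Automatic investigation jobs"


def jobs_with_duplicate_investigations(rows):
    """Return sorted job IDs with investigation comments from more than one author.

    rows: iterable of (job_id, nickname, text) tuples; nickname may be None.
    """
    authors = defaultdict(set)
    for job_id, nickname, text in rows:
        if MARKER in text:
            authors[job_id].add(nickname)
    return sorted(job_id for job_id, names in authors.items() if len(names) > 1)


# Illustrative rows: one job commented on by both users, two by only one
rows = [
    (1511805, None, "Automatic investigation jobs: ..."),
    (1511805, "geekotest", "Automatic investigation jobs: ..."),
    (1512591, "geekotest", "Automatic investigation jobs: ..."),
    (1509483, None, "Automatic investigation jobs: ..."),
]
duplicates = jobs_with_duplicate_investigations(rows)
print(duplicates)  # prints: [1511805]
```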

#5 Updated by cdywan 6 months ago

  • Due date deleted (2020-12-23)

Deleting due date in the hopes of fixing the magical due date on #77899.
