coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues - openQA Project (public) - openSUSE Project Management Tool

#1

Updated by okurz over 6 years ago

Related to action #13242: WDYT: For every job that does not have a label or bugref, retrigger some times to see if it's sporadic. Like rescheduling on incomplete but on failed added

#2

Updated by okurz over 6 years ago

Related to action #38621: [functional][y] test fails in welcome - "Module is not signed with expected PKCS#7 message" (bsc#1093659) - Use serial exception catching feature from openQA to make sure the jobs reference the bug, e.g. as label added

#3

Updated by okurz over 6 years ago

Related to coordination #13812: [epic][dashboard] openQA Dashboard ideas added

#4

Updated by okurz over 6 years ago

Related to deleted (action #38621: [functional][y] test fails in welcome - "Module is not signed with expected PKCS#7 message" (bsc#1093659) - Use serial exception catching feature from openQA to make sure the jobs reference the bug, e.g. as label)

#5

Updated by nicksinger over 6 years ago

Another idea which could be checked/better reported to the user:

If a crucial component in the "os-autoinst-chain" fails (e.g. xterm for ipmi jobs), openQA could easily report this earlier. As it is right now, the job stalls (hangs as "running") but only shows a black screen. Example: https://openqa.suse.de/tests/1970948 (look for "PermissionError" in the osautoinst-log.txt)
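
A minimal sketch of how such an early check could look, assuming the log is available as plain text. Only the string "PermissionError" comes from the example above; the function name and the pattern list are hypothetical and not part of os-autoinst or openQA.

```python
import re

# Hypothetical list of patterns that indicate a broken backend component.
# Only "PermissionError" is taken from the example job mentioned above.
FATAL_PATTERNS = [re.compile(r"PermissionError")]


def find_fatal_lines(log_text):
    """Return (line_number, line) pairs for log lines matching a fatal pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in FATAL_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

A caller could run this periodically against the growing log and abort the job with a clear reason as soon as the result is non-empty, instead of leaving it in "running".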

#6

Updated by coolo over 6 years ago

Target version set to future

IMO this is best handled by an automated review from outside. The problem is not so much detecting the issue, but how to handle it. For some projects/objects you would do a retrigger, for others you would prefer defining a label.
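
To illustrate the point about handling, here is a rough sketch of a per-project policy that an external review tool might apply. The project names, the `POLICY` mapping and the `decide` function are assumptions made up for this example; nothing here is an existing openQA interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str  # either "retrigger" or "label"
    issue_ref: str


# Hypothetical policy table: which handling each project prefers.
POLICY = {
    "project-a": "retrigger",
    "project-b": "label",
}


def decide(project, issue_ref, default="label"):
    """Pick the handling for a detected known issue based on the project."""
    action = POLICY.get(project, default)
    return Decision(action=action, issue_ref=issue_ref)
```

Keeping the policy outside the detection step means the same detector can serve projects that want a retrigger and projects that want a label.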

#7

Updated by okurz over 6 years ago

"outside", yes, I agree. It should be outside what is currently defined as "openQA", but we might still call it part of "the openQA ecosystem", so I guess this issue tracker is still best suited. Some parts we have already covered with the proof-of-concept of detecting known failures in the serial port output.

#8

Updated by coolo over 6 years ago

I don't disagree with the issue tracker - I just don't want a High priority epic in my 'to be sorted' list

#9

Updated by okurz over 6 years ago

Related to action #42446: [qe-core][functional] many opensuse tests fail in desktop_runner or gimp or other modules in what I think is boo#1105691 – can we detect this bug from the journal and track as soft-fail? added

#10

Updated by okurz over 6 years ago

Subject changed from [epic] Detect "known failures" and mark jobs as such to [functional][y][u][epic] Detect "known failures" and mark jobs as such

Trying to bring it forward with help of QSF again…

#11

Updated by okurz over 6 years ago

Related to action #27004: [opensuse][sle][functional][yast][y][hard] yast2 gui modules fail to start in the defined time frame added

#12

Updated by okurz over 6 years ago

Related to deleted (action #27004: [opensuse][sle][functional][yast][y][hard] yast2 gui modules fail to start in the defined time frame)

#13

Updated by okurz over 6 years ago

Blocks action #27004: [opensuse][sle][functional][yast][y][hard] yast2 gui modules fail to start in the defined time frame added

#14

Updated by okurz over 6 years ago

Related to action #40382: Make "ignored" issues more prominent (was: create new state "ignored") added

#15

Updated by okurz over 6 years ago

https://github.com/os-autoinst/os-autoinst/pull/1052 to "Add option to override status of test modules with soft-fail"

#16

Updated by okurz over 6 years ago

Status changed from New to Feedback
Assignee set to okurz

#17

Updated by okurz over 6 years ago

The feature is not working as intended because in https://github.com/os-autoinst/os-autoinst/blob/master/basetest.pm#L286 we overwrite the result again. I am trying to simply remove that method :)

-> https://github.com/os-autoinst/os-autoinst/pull/1062

I also presented my idea to riafarov and we identified one problematic scenario: What if we force the status of a parent job to "softfail"? For now, openQA would still trigger the downstream jobs, which would then most likely fail because a module in the parent job failed. In the worst case the downstream jobs would even become incomplete because the HDD image was never published properly. We should avoid this, though.
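
One conceivable guard against this scenario, written as a sketch only: the function and its parameters are hypothetical and do not describe how openQA schedules dependent jobs today. It assumes the scheduler can see the per-module results from before the override and knows whether the HDD image was published.

```python
# Results that normally allow dependent jobs to start (assumed set).
OK_RESULTS = {"passed", "softfailed"}


def should_trigger_downstream(reported_result, raw_module_results, image_published):
    """Decide whether child jobs should start after a parent job finishes.

    reported_result: the job result after any forced override.
    raw_module_results: per-module results before the override.
    image_published: whether the parent actually published its HDD image.
    """
    if reported_result not in OK_RESULTS:
        return False
    hidden_failure = "failed" in raw_module_results
    # A forced softfail that hides a real module failure is only safe
    # if the asset the children depend on was still published.
    if hidden_failure and not image_published:
        return False
    return True
```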

#18

Updated by okurz over 6 years ago

Related to action #43784: [functional][y][sporadic] test fails in yast2_snapper now reproducibly not exiting the "show differences" screen added

#19

Updated by szarate over 6 years ago

Related to action #45011: Allow detection of known failures at the autoinst-log.txt added

#20

Updated by szarate over 6 years ago

I see that one of the suggestions on this ticket was exactly what poo#45011 is about :)

#21

Updated by agraul over 6 years ago

Related to deleted (action #45011: Allow detection of known failures at the autoinst-log.txt)

#22

Updated by agraul over 6 years ago

Blocked by action #45011: Allow detection of known failures at the autoinst-log.txt added

#23

Updated by agraul over 6 years ago

Status changed from Feedback to Blocked

#45011

#24

Updated by okurz over 6 years ago

Due date changed from 2018-08-28 to 2019-03-12

due to changes in a related task

#25

Updated by okurz about 6 years ago

Due date changed from 2019-03-12 to 2019-06-30

due to changes in a related task

#26

Updated by okurz almost 6 years ago

Assignee changed from okurz to riafarov

Moved to the new QSF-y PO after I moved to the "tools" team. I mainly checked the subject line, so in individual instances you might not agree to take it over completely into QSF-y. Feel free to reassign to me or someone else in this case. Thanks.

#27

Updated by riafarov almost 6 years ago

Blocks deleted (action #27004: [opensuse][sle][functional][yast][y][hard] yast2 gui modules fail to start in the defined time frame)

#28

Updated by riafarov almost 6 years ago

Due date changed from 2019-06-30 to 2019-08-06

due to changes in a related task

#29

Updated by riafarov almost 6 years ago

Due date changed from 2019-08-06 to 2019-12-31

due to changes in a related task

#30

Updated by okurz over 5 years ago

Related to action #57452: Automatic summary of failures added

#31

Updated by okurz over 5 years ago

Using https://github.com/os-autoinst/scripts/blob/master/monitor-openqa_job and https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues I set up a GitLab CI pipeline in https://gitlab.suse.de/openqa/auto-review/ that automatically labels (and restarts) incompletes for which we know the reasons. The approach could also be extended to cover not only incompletes.
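
The general idea of labelling (and optionally restarting) jobs with known reasons can be sketched as follows. This is not the code of the scripts linked above; the patterns, ticket references and function name are placeholders chosen for illustration.

```python
import re

# Hypothetical table of known issues: (pattern on the job's reason text,
# ticket reference to use as label, whether the job should be restarted).
KNOWN_ISSUES = [
    (re.compile(r"worker lost connection"), "TICKET-A", True),
    (re.compile(r"asset .* missing"), "TICKET-B", False),
]


def review(job_id, reason):
    """Return the list of actions to take for a job, based on its reason text."""
    for pattern, ticket, restart in KNOWN_ISSUES:
        if pattern.search(reason):
            actions = [("label", job_id, ticket)]
            if restart:
                actions.append(("restart", job_id))
            return actions
    # Unknown reason: leave the job for manual review.
    return []
```

Returning a list of actions rather than performing them keeps the matching logic testable without access to a running openQA instance.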

#32

Updated by okurz over 5 years ago

Related to coordination #19720: [epic] Simplify investigation of job failures added

#33

Updated by riafarov over 5 years ago

Assignee changed from riafarov to okurz

As it's mainly the tools team working on this epic, @okurz I will set you as the assignee to track the progress. Feel free to change it, I rely on your expertise to pick a more suitable person if it's not you. Thanks!

#34

Updated by okurz over 5 years ago

Subject changed from [functional][y][u][epic] Detect "known failures" and mark jobs as such to [epic] Detect "known failures" and mark jobs as such

that's ok, it's me :)

There is currently only one open subtask, #46988, on QSF-u though.

#35

Updated by okurz over 5 years ago

Due date changed from 2019-12-31 to 2020-12-31

due to changes in a related task

#36

Updated by okurz about 5 years ago

Subject changed from [epic] Detect "known failures" and mark jobs as such to [saga] Detect "known failures" and mark jobs as such

#37

Updated by okurz about 5 years ago

Subject changed from [saga] Detect "known failures" and mark jobs as such to [saga][epic] Detect "known failures" and mark jobs as such

#38

Updated by SLindoMansilla about 5 years ago

Due date changed from 2020-12-31 to 2020-03-27

due to changes in a related task: #46988

#39

Updated by okurz almost 5 years ago

Due date changed from 2020-06-09 to 2020-03-27

due to changes in a related task: #62420

#40

Updated by okurz almost 5 years ago

Due date changed from 2018-08-28 to 2020-03-27

due to changes in a related task: #38621

#41

Updated by szarate over 4 years ago

Tracker changed from action to coordination
Status changed from Blocked to New

#42

Updated by szarate over 4 years ago

For the reason behind the tracker change, see: http://mailman.suse.de/mailman/private/qa-sle/2020-October/002722.html

#43

Updated by okurz over 4 years ago

Status changed from New to Blocked
Target version changed from future to Ready

Discussed the topic of "auto-review" with the SUSE QA Tools team. The general opinion was that this epic is interesting to follow up on, so I am putting it into the backlog now.

#44

Updated by okurz over 4 years ago

Subject changed from [saga][epic] Detect "known failures" and mark jobs as such to [saga][epic] Detect "known failures" and mark jobs as such to make tests more stable, reviewing test results and tracking known issues easier

#45

Updated by livdywan over 4 years ago

Once again wondering: where's the due date coming from? It's not visible. Do we need to go through every single ticket again to check?

#46

Updated by okurz over 4 years ago

Maybe the API helps to find that easily but in this case it's #80264

#47

Updated by okurz about 4 years ago

Subject changed from [saga][epic] Detect "known failures" and mark jobs as such to make tests more stable, reviewing test results and tracking known issues easier to [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

#48

Updated by okurz over 3 years ago

Copied to coordination #102906: [saga][epic] Increased stability of tests with less "known failures", known incompletes handled automatically within openQA added

#49

Updated by okurz over 3 years ago

Blocked by deleted (action #45011: Allow detection of known failures at the autoinst-log.txt)

#50

Updated by okurz over 3 years ago

Related to action #45011: Allow detection of known failures at the autoinst-log.txt added

#51

Updated by okurz over 3 years ago

Status changed from Blocked to Resolved

I checked all comments and suggestions from the description again. Strictly speaking, we have not yet fully reached the originally envisioned goal, which assumed automation that provides everything we need. However, to better separate out the parts that are already done, and which provide all the information currently available on test results, I consider this saga resolved. Follow-up work is planned to automatically classify test results (see #102915) and to handle more issues directly within os-autoinst or openQA (#102906). So, work concluded! Thanks to all contributors.
