action #152853
closed coordination #102915: [saga][epic] Automated classification of failures
QA - coordination #94105: [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests
Prevent faulty openQA workers causing wrong openqa-investigate conclusions size:M
0%
Description
Motivation
With #109920 openqa-investigate can identify clear PRODUCT REGRESSIONS, with #132272 TEST REGRESSIONS, and with #151399 infrastructure issues. However, openqa-investigate can come to, or lead to, wrong conclusions when the jobs being compared ran on different workers and the worker itself has an impact on the result. By default openqa-investigate should not be affected by that, e.g. by executing investigation jobs only on the very same openQA worker.
Acceptance criteria
- AC1: openqa-investigate last_good_test+last_good_build jobs run on the same worker as the original failure if uniquely identifiable by WORKER_CLASS
- AC2: openqa-investigate provides a caution notice if no worker could be uniquely identified
Suggestions
- Learn what had been done in #109920 for product regressions and #132272 for test regressions and #151399 for infrastructure regressions
- Identify "worker" by WORKER_CLASS or other means and schedule openqa-investigate last_good_test+last_good_build jobs so that they run on the same worker if uniquely identifiable by WORKER_CLASS
- If no such worker can be uniquely identified, make openqa-investigate provide a caution notice, e.g. either extend the openQA comment that is written or extend https://github.com/os-autoinst/scripts/blob/master/README.md#openqa-investigate---automatic-investigation-jobs-with-failure-analysis-in-openqa, as we already point to that in the comment
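The decision described in the suggestions could be sketched roughly as follows. This is only an illustration in Python, not how openqa-investigate is implemented: the function names, the notice text, and the idea of passing in a mapping of worker names to their worker classes are all assumptions made for the example.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical wording; the ticket does not prescribe the notice text.
CAUTION = (
    "Caution: no unique worker could be identified by WORKER_CLASS; "
    "investigation jobs may have run on a different worker."
)


def find_unique_worker(job_classes: Set[str],
                       workers: Dict[str, Set[str]]) -> Optional[str]:
    """Return the only worker offering all requested classes, else None."""
    matches = [name for name, classes in workers.items()
               if job_classes <= classes]
    return matches[0] if len(matches) == 1 else None


def plan_investigation(job_classes: Set[str],
                       workers: Dict[str, Set[str]]
                       ) -> Tuple[Optional[str], Optional[str]]:
    """Return (worker, notice): either a pinned worker or a caution notice."""
    worker = find_unique_worker(job_classes, workers)
    if worker is None:
        return None, CAUTION
    return worker, None
```

With two hypothetical workers that both offer `qemu_x86_64`, a job asking only for `qemu_x86_64` would get the caution notice, while a job that also asks for a class only one worker offers would be pinned to that worker.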
Updated by okurz 11 months ago
- Copied from action #151399: Identify reproducible *infrastructure* issues using openqa-investigate size:M added
Updated by ybonatakis 10 months ago
- Status changed from Workable to In Progress
- Assignee set to ybonatakis
Updated by livdywan 10 months ago
okurz wrote in #note-6:
picking up a ticket that is in "future", i.e. not in our backlog should be the rare exception. How did you find this ticket?
As mentioned in chat. The ticket was in the backlog until yesterday afternoon and we talked about how to approach it.
Updated by ybonatakis 10 months ago
I was ready to submit a draft before I noticed that the requirements ask for WORKER_CLASS only for last_good_test+last_good_build. So I need to review the code tomorrow.
Updated by ybonatakis 10 months ago
- Status changed from In Progress to Feedback
https://github.com/os-autoinst/scripts/pull/286 draft ready for reviews
Updated by ybonatakis 10 months ago
- Status changed from Feedback to In Progress
Updated by ybonatakis 10 months ago
- Status changed from In Progress to Feedback
Updated by ybonatakis 10 months ago
- Due date deleted (2024-02-07)
I have to update the changes as I think it sets the wrong name when it clones multimachine jobs.
Updated by ybonatakis 10 months ago
- Status changed from In Progress to Resolved
writeup:
It is difficult to identify the actual worker that picked up the test, so we clone the test with exactly the list of worker classes that comes from vars.json.
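As a rough illustration of that approach, the following Python sketch derives a `WORKER_CLASS=...` setting from the content of a job's vars.json. It assumes that vars.json is a flat JSON object with a comma-separated `WORKER_CLASS` value; the function name is hypothetical and this is not the code from the pull request.

```python
import json


def worker_class_override(vars_json_text: str) -> str:
    """Build a KEY=VALUE setting that repeats the job's worker classes.

    Assumes vars.json is a flat JSON object whose WORKER_CLASS value is a
    comma-separated string.
    """
    settings = json.loads(vars_json_text)
    raw = settings.get("WORKER_CLASS", "")
    # Normalize whitespace and drop empty entries, keeping the order.
    classes = [c.strip() for c in raw.split(",") if c.strip()]
    if not classes:
        raise ValueError("vars.json has no WORKER_CLASS")
    return "WORKER_CLASS=" + ",".join(classes)
```

The resulting string could then be passed as a setting override when cloning the job. If vars.json carries no `WORKER_CLASS`, the sketch raises an error rather than guessing, which is the case where a caution notice would be appropriate.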
Updated by livdywan 9 months ago
okurz wrote in #note-17:
Ok, would be good to have a link to a production openQA investigate comment showing the feature in action
https://openqa.suse.de/tests/13580562#settings
Seems to work as intended, although I did not see a counterexample where it would fail to identify the worker class.