action #164296

closed

coordination #102915: [saga][epic] Automated classification of failures

coordination #166655: [epic] openqa-label-known-issues

openqa-label-known-issues does not look at known issues if autoinst-log.txt does not exist but reason could be looked at size:S

Added by okurz 3 months ago. Updated 30 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Observation

As observed in #162038, openQA jobs ended up with no autoinst-log.txt because the files could not be uploaded due to a "timestamp mismatch". That error is written to the "reason" field, but openqa-label-known-issues does not look at this field if autoinst-log.txt is not present, for example on the assumption that the job is too old and shouldn't be looked at anyway.

Acceptance criteria

  • AC1: openqa-label-known-issues also checks the "reason" field even if autoinst-log.txt does not exist at all
  • AC2: old jobs where autoinst-log.txt is already deleted are still not considered for review
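
As an illustration of AC1, here is a minimal sketch of checking a job's "reason" against an auto_review regex without needing autoinst-log.txt. The helper name reason_matches and the sample reason text are assumptions made for this example and are not taken from os-autoinst/scripts; the regex is the one from the subject of #162038. It assumes a grep with PCRE support (-P), as in GNU grep.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the job's "reason" matches an auto_review regex.
reason_matches() {
    local reason=$1 regex=$2
    # A missing reason (empty or the JSON literal "null") can never match
    [[ -n $reason && $reason != null ]] || return 1
    grep -qP -- "$regex" <<< "$reason"
}

# Regex from the subject of #162038; the reason text below is made up
regex='.*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync'
reason='api failure: timestamp mismatch - check whether clocks on the local host and the web UI host are in sync'

if reason_matches "$reason" "$regex"; then
    echo "known issue"
else
    echo "unknown issue"
fi
```

A reason of "null", as in some of the jobs below, would fall through to the "unknown issue" branch.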

Suggestions


Related issues 3 (1 open, 2 closed)

  • Related to openQA Project - action #165716: [o3] Munin - minion hook failed - /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68 size:M (Resolved, ybonatakis, 2024-08-23)
  • Copied from openQA Infrastructure - action #162038: No HTTP Response on OSD on 10-06-2024 - auto_review:".*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync":retry size:S (Resolved, nicksinger, 2024-06-10)
  • Copied to openQA Project - action #166649: Rewrite openqa-label-known-issues in Python or another better maintainable language (New)

Actions #1

Updated by okurz 3 months ago

  • Copied from action #162038: No HTTP Response on OSD on 10-06-2024 - auto_review:".*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync":retry size:S added
Actions #2

Updated by okurz 3 months ago

  • Project changed from openQA Infrastructure to openQA Project
  • Category changed from Feature requests to Feature requests
Actions #3

Updated by tinita 3 months ago

  • Subject changed from openqa-label-known-issues does not look at known issues if autoinst-log.txt does not exist but reason could be looked at to openqa-label-known-issues does not look at known issues if autoinst-log.txt does not exist but reason could be looked at size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #4

Updated by ybonatakis 2 months ago

  • Status changed from Workable to In Progress
  • Assignee set to ybonatakis
Actions #6

Updated by ybonatakis 2 months ago

Output as of the last change:

Here with reason=null and autoinst-log.txt present:

❯ dry_run=1 host=openqa.suse.de ./openqa-label-known-issues https://openqa.suse.de/tests/15089626
Requesting jobs/15089626 via openqa-cli
[https://openqa.suse.de/tests/15089626](https://openqa.suse.de/tests/15089626): Unknown test issue, to be reviewed
-> [autoinst-log.txt](https://openqa.suse.de/tests/15089626/file/autoinst-log.txt)

Last lines before SUT shutdown:

    # --- 8< ---
    # [2024-08-05T12:10:26.164521+02:00] [debug] [pid:126110] QEMU status is not 'shutdown', it is 'running'
    # [2024-08-05T12:10:26.164636+02:00] [debug] [pid:125698] backend shutdown state: 
    # [2024-08-05T12:10:26.164988+02:00] [info] [pid:126110] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
    # [2024-08-05T12:10:27.220847+02:00] [debug] [pid:126110] Passing remaining frames to the video encoder
    # frame=  771 fps=1.9 q=0.0 Lsize=    1531kB time=00:00:32.08 bitrate= 390.9kbits/s speed=0.0777x    
    # video:1525kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.355806%
    # [2024-08-05T12:10:27.961098+02:00] [debug] [pid:126110] Waiting for video encoder to finalize the video
    # [2024-08-05T12:10:27.961202+02:00] [debug] [pid:126110] The built-in video encoder (pid 126494) terminated
    # [2024-08-05T12:10:27.961265+02:00] [debug] [pid:126110] The external video encoder (pid 126492) terminated
    # [2024-08-05T12:10:27.961832+02:00] [debug] [pid:126110] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 126110 (/usr/bin/isotovideo: backen)
    # --- >8 ---

Here with reason=null and without autoinst-log.txt:

❯ dry_run=1 host=openqa.suse.de ./openqa-label-known-issues https://openqa.suse.de/tests/14890572
Requesting jobs/14890572 via openqa-cli
reason null
'https://openqa.suse.de/tests/14890572' does not have autoinst-log.txt but is rather old, ignoring
[https://openqa.suse.de/tests/14890572](https://openqa.suse.de/tests/14890572): Unknown test issue, to be reviewed
-> [autoinst-log.txt](https://openqa.suse.de/tests/14890572/file/autoinst-log.txt)

Last lines before SUT shutdown:

    # --- 8< ---

    # --- >8 ---

Here with reason=api failure and no autoinst-log.txt:

❯ dry_run=1 host=openqa.suse.de ./openqa-label-known-issues https://openqa.suse.de/tests/14897579
Requesting jobs/14897579 via openqa-cli
'https://openqa.suse.de/tests/14897579' does not have autoinst-log.txt but is rather old, ignoring
foo
openqa-cli api --header User-Agent: openqa-label-known-issues (https://github.com/os-autoinst/scripts) --host https://openqa.suse.de --retries=3 -X POST jobs/14897579/comments text=poo#73375 Job incompletes with reason auto_review:"(?m)api failure$" (and no further details)
Actions #7

Updated by ybonatakis 2 months ago

  • Status changed from In Progress to Feedback

Another small PR, outside the scope of this ticket: https://github.com/os-autoinst/scripts/pull/336

Actions #8

Updated by ybonatakis 2 months ago

Note from @tinita in the PR review:
There is already handling in handle_unreachable_or_no_log that will additionally check if the job itself is deleted or not, so I think we shouldn't unconditionally go on here.

We might want to look into this further.

Actions #9

Updated by ybonatakis 2 months ago

  • Status changed from Feedback to In Progress
Actions #10

Updated by ybonatakis 2 months ago · Edited

Thanks to Tina and Oli I dug deeper to find a solution. I haven't pushed anything yet; it is in progress, but we might discuss it in the unblock. However, I found https://github.com/os-autoinst/scripts/blob/master/_common#L174, which doesn't seem to be used anywhere.

Anyhow, this is what the script's output looks like in the latest attempts:

❯ dry_run=1 host=openqa.suse.de ./openqa-label-known-issues https://openqa.suse.de/tests/14897579
Requesting jobs/14897579 via openqa-cli
curl ret 404
reason api failure
from label_on_issues_from_issue_tracker
openqa-cli api --header User-Agent: openqa-label-known-issues (https://github.com/os-autoinst/scripts) --host https://openqa.suse.de --retries=3 -X POST jobs/14897579/comments text=poo#73375 Job incompletes with reason auto_review:"(?m)api failure$" (and no further details)
'https://openqa.suse.de/tests/14897579' does not have autoinst-log.txt but is rather old, ignoring
Actions #11

Updated by openqa_review 2 months ago

  • Due date set to 2024-08-21

Setting due date based on mean cycle time of SUSE QE Tools

Actions #12

Updated by ybonatakis 2 months ago

  • Status changed from In Progress to Feedback
Actions #13

Updated by okurz 2 months ago

Based on our discussion in the unblock, my suggestion is to split the function handle_unreachable_or_no_log into handle_unreachable and handle_no_log and then call them like this:

if unreachable then handle_unreachable
elif label
elif handle_no_log
…
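
A runnable sketch of that call order, for illustration only. Every function body here is a placeholder that works on plain arguments instead of real openQA requests, and label_on_issues is only a stand-in name for the labelling step; none of this is the code from os-autoinst/scripts.

```shell
#!/usr/bin/env bash
# Placeholder: succeeds if the job itself could not be fetched
handle_unreachable() {
    [[ $1 == no ]] || return 1
    echo "job is unreachable, skipping"
}

# Placeholder: succeeds if a known issue matched (here: the reason only)
label_on_issues() {
    [[ $1 == *"api failure"* ]] || return 1
    echo "labelled as known issue"
}

# Placeholder: succeeds if there is no log left to investigate
handle_no_log() {
    [[ $1 == no ]] || return 1
    echo "no autoinst-log.txt, ignoring"
}

# Arguments: reachable (yes/no), has_log (yes/no), reason
investigate() {
    local reachable=$1 has_log=$2 reason=$3
    if handle_unreachable "$reachable"; then
        :
    elif label_on_issues "$reason"; then
        :
    elif handle_no_log "$has_log"; then
        :
    else
        echo "unknown issue, to be reviewed"
    fi
}

investigate yes no 'api failure'
```

Because labelling runs before handle_no_log, a job without autoinst-log.txt can still be labelled from its reason, while a job with neither a matching reason nor a log is ignored.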

Additionally or alternatively, tinita suggested capturing the exit code with || rc=$? and then deciding, based on that exit code, whether to exit, return, or go on with labelling.
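
A minimal sketch of the || rc=$? idea under "set -e". The exit code values, the names check_job and decide, and the yes/no arguments are all assumptions for this example; the real script may look quite different.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical exit codes, not taken from the real script
RC_SKIP=2      # nothing more to do for this job
RC_CONTINUE=3  # no log, but the "reason" can still be checked

# Placeholder check; takes yes/no flags instead of doing HTTP requests
check_job() {
    local reachable=$1 has_log=$2
    [[ $reachable == yes ]] || return "$RC_SKIP"
    [[ $has_log == yes ]] || return "$RC_CONTINUE"
    return 0
}

decide() {
    local rc=0
    # "|| rc=$?" records the failure instead of letting "set -e" abort the script
    check_job "$@" || rc=$?
    case $rc in
        0) echo "label from log and reason" ;;
        "$RC_CONTINUE") echo "label from reason only" ;;
        "$RC_SKIP") echo "skip" ;;
        *) echo "unexpected exit code $rc" >&2; return "$rc" ;;
    esac
}

decide yes no
```

The caller stays in control: a distinct exit code per situation lets it choose between stopping and continuing with labelling, rather than having the helper exit on its own.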

Actions #15

Updated by ybonatakis about 2 months ago

  • Status changed from Feedback to Resolved

Merged, with some of the improvements recommended by Tina left out. Closing it for now.

Actions #16

Updated by tinita about 2 months ago

  • Related to action #165716: [o3] Munin - minion hook failed - /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68 size:M added
Actions #17

Updated by ybonatakis about 2 months ago · Edited

Continued in #165716.

Actions #18

Updated by livdywan 30 days ago

  • Copied to action #166649: Rewrite openqa-label-known-issues in Python or another better maintainable language added
Actions #19

Updated by okurz 30 days ago

  • Parent task set to #166655
Actions #20

Updated by okurz 30 days ago

  • Due date deleted (2024-08-21)