action #165716

closed

coordination #102915: [saga][epic] Automated classification of failures

coordination #166655: [epic] openqa-label-known-issues

[o3] Munin - minion hook failed - /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68 size:M

Added by tinita 3 months ago. Updated 2 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2024-08-23
Due date:
% Done: 0%
Estimated time:

Description

Observation

We got an alert for o3:
opensuse.org :: openqa.opensuse.org :: hook failed - see openqa-gru service logs for details
WARNINGs: rc_failed_per_5min is 8.00 (outside range [:5]).

Here are the problematic lines in the journal:

sudo journalctl -u openqa-gru --since '2024-08-23'
Aug 23 00:03:23 ariel systemd[1]: Stopping The openQA daemon for various background tasks like cleanup and saving needles...
Aug 23 00:03:25 ariel systemd[1]: openqa-gru.service: Deactivated successfully.
Aug 23 00:03:25 ariel systemd[1]: Stopped The openQA daemon for various background tasks like cleanup and saving needles.
Aug 23 00:03:25 ariel systemd[1]: Started The openQA daemon for various background tasks like cleanup and saving needles.
Aug 23 08:03:03 ariel openqa-gru[18277]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:03:12 ariel openqa-gru[18715]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:03:26 ariel openqa-gru[19152]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:03:44 ariel openqa-gru[19454]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:04:06 ariel openqa-gru[19770]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:04:22 ariel openqa-gru[20283]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:04:35 ariel openqa-gru[20569]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:04:40 ariel openqa-gru[20836]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:05:43 ariel openqa-gru[22016]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68
Aug 23 08:06:45 ariel openqa-gru[24067]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68

I can find the corresponding minion entries. They have a hook_rc of 1, but unfortunately no useful output.
https://openqa.opensuse.org/minion/jobs?id=4223339
https://openqa.opensuse.org/minion/jobs?id=4223321
https://openqa.opensuse.org/minion/jobs?id=4223308

We also have a few of those errors on osd.

The first error I can find on o3 is from August 18:

Aug 18 01:30:04 ariel openqa-gru[31244]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68

For osd it's August 16:

Aug 16 11:57:21 openqa openqa-gru[7266]: /opt/os-autoinst-scripts/openqa-label-known-issues: ERROR: line 68

More detail

investigate_issue

We write autoinst-log.txt and reason into the same file.
If we successfully got autoinst-log.txt (HTTP 200 or 301), we go on to label the test. DONE.

No autoinst-log.txt

  • If we couldn't fetch autoinst-log.txt, check whether there is a general issue. handle_unreachable performs various checks and should return non-zero to indicate that we shouldn't go on trying to label the test.
  • If the HTTP status is not 404, don't continue with labeling.
  • If the job is too old, don't continue with labeling.
  • If there is no reason either, don't continue with labeling. Continue with labeling only if the HTTP status is 404, the job is not too old, and the reason is set.
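
The decision rules above can be sketched as one small bash function. This is only an illustration: should_continue_labeling and its arguments are hypothetical names, not taken from the real script, and the "general issue" flag stands in for whatever handle_unreachable actually checks.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of openqa-label-known-issues.
# Args: <http_status> <job_too_old: 0|1> <reason> [<general_issue: 0|1>]
# Returns 0 if labeling should continue, 1 otherwise.
should_continue_labeling() {
    local status=$1 too_old=$2 reason=$3 general_issue=${4:-0}

    # autoinst-log.txt was fetched successfully: always try to label.
    if [[ $status == 200 || $status == 301 ]]; then
        return 0
    fi

    # Stand-in for handle_unreachable reporting a general problem.
    [[ $general_issue == 0 ]] || return 1
    # Without the log, only a plain 404 is acceptable.
    [[ $status == 404 ]] || return 1
    # Old jobs are skipped.
    [[ $too_old == 0 ]] || return 1
    # Without a reason there is nothing left to match against.
    [[ -n $reason ]] || return 1

    return 0
}
```

For example, `should_continue_labeling 404 0 "some reason"` returns 0, while the same call with an empty reason returns 1.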

Acceptance Criteria

  • AC1: Hook script does not abort when label_on_issues_from_issue_tracker returns non-zero
  • AC2: The relevant part of the script is tested
  • AC3: Behaviour from before the previous ticket/PR is reinstated
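
One possible way to satisfy AC1 is sketched below. It assumes the hook script runs with errexit (set -e) or a similar error trap, which the "ERROR: line 68" message suggests but the ticket does not confirm. The function body here is a stand-in that always returns 1, and label_job is a hypothetical wrapper name.

```bash
#!/usr/bin/env bash
set -e

# Stand-in for the real label_on_issues_from_issue_tracker; it always
# returns non-zero so the handling below can be observed.
label_on_issues_from_issue_tracker() {
    return 1
}

# Hypothetical wrapper: capture the return code instead of letting
# errexit abort the whole script.
label_job() {
    local rc=0
    # A command on the left side of || does not trigger errexit.
    label_on_issues_from_issue_tracker "$@" || rc=$?
    if [[ $rc -ne 0 ]]; then
        echo "label_on_issues_from_issue_tracker returned $rc, continuing"
    fi
    return 0
}

label_job 12345
```

Calling the function bare, without the `|| rc=$?` guard, would end the script at that line under set -e, which would match the abort seen in the journal if the assumption holds.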

Related issues 3 (2 open, 1 closed)

Related to openQA Project - action #164296: openqa-label-known-issues does not look at known issues if autoinst-log.txt does not exist but reason could be looked at size:S (Resolved, ybonatakis)
Related to openQA Project - action #166649: Rewrite openqa-label-known-issues in Python or another better maintainable language (New)
Related to openQA Project - action #166772: openqa-label-known-issues overrides size:S (In Progress, ybonatakis, 2024-09-13 to 2024-11-27)
