action #166403
closed
Munin - minion hook failed - see openqa-gru service logs for details - 404 Not Found size:S
Description
Observation
Date: Wed, 04 Sep 2024 12:05:07 +0000
Subject: Munin - minion hook failed - see openqa-gru service logs for details - opensuse.org :: openqa.opensuse.org
opensuse.org :: openqa.opensuse.org :: hook failed - see openqa-gru service logs for details
CRITICALs: rc_failed_per_5min is 32.00 (outside range [:10]).
% journalctl -u openqa-gru --since '2024-09-04'
...
Sep 04 12:02:24 ariel openqa-gru[7200]: 404 Not Found
Sep 04 12:02:44 ariel openqa-gru[7200]:
Sep 04 12:02:44 ariel openqa-gru[11188]: 'http://openqa.opensuse.org/tests/4454336' does not have autoinst-log.txt but is rather old, ignoring
Sep 04 12:03:28 ariel openqa-gru[18613]: 404 Not Found
Sep 04 12:03:31 ariel openqa-gru[18613]:
Sep 04 12:03:31 ariel openqa-gru[18944]: 404 Not Found
Sep 04 12:03:36 ariel openqa-gru[18944]:
Sep 04 12:03:36 ariel openqa-gru[20180]: 404 Not Found
Sep 04 12:03:36 ariel openqa-gru[20180]:
Sep 04 12:03:36 ariel openqa-gru[20178]: 404 Not Found
Sep 04 12:03:42 ariel openqa-gru[20178]:
Sep 04 12:03:42 ariel openqa-gru[21430]: 404 Not Found
Sep 04 12:03:42 ariel openqa-gru[21430]:
Sep 04 12:03:42 ariel openqa-gru[21445]: 404 Not Found
Sep 04 12:03:49 ariel openqa-gru[21445]:
Sep 04 12:03:49 ariel openqa-gru[22756]: 404 Not Found
Sep 04 12:03:51 ariel openqa-gru[22756]:
Sep 04 12:03:51 ariel openqa-gru[23156]: 404 Not Found
Sep 04 12:03:54 ariel openqa-gru[23156]:
Sep 04 12:03:54 ariel openqa-gru[23770]: 404 Not Found
Sep 04 12:03:55 ariel openqa-gru[23770]:
Sep 04 12:03:55 ariel openqa-gru[24146]: 404 Not Found
Sep 04 12:03:56 ariel openqa-gru[24146]:
Sep 04 12:03:56 ariel openqa-gru[24646]: 404 Not Found
Sep 04 12:04:07 ariel openqa-gru[24646]:
...
Unfortunately the log shows neither a script name nor a line number, even though we have a mechanism (e.g. in runcurl)
that reports the caller in case of an error. Maybe this is a call that doesn't use that wrapper.
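To illustrate what such caller reporting can look like, here is a minimal sketch. It is not the actual runcurl implementation from os-autoinst/scripts; the function name `run_reporting_caller` is made up for this example. It relies only on the bash builtin `caller`, which prints the line number, function name and file of the call site.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (NOT the real runcurl): run any command and, if it
# fails, report where the wrapper was called from plus the full command line.
run_reporting_caller() {
    local rc=0
    "$@" || rc=$?
    if ((rc != 0)); then
        # `caller 0` prints "<line> <function> <file>" for the call site
        local line func file
        read -r line func file <<< "$(caller 0)"
        echo "$file:$line ($func): '$*' failed with exit code $rc" >&2
    fi
    return "$rc"
}

# Example use (the URL variable is an assumption for illustration):
#   run_reporting_caller curl -sS --fail "$url"
```

Because the whole command line is echoed, the URL passed to curl ends up in the error message together with the script and line number.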
Judging from the timestamp, one related minion job is probably this one:
https://openqa.opensuse.org/minion/jobs?id=4272268
notes:
hook_cmd: env from_email=o3-admins@suse.de scheme=http enable_force_result=true
email_unreviewed=true exclude_group_regex='(Development|Open Build Service|Others|Kernel).*/.*'
/opt/os-autoinst-scripts/openqa-label-known-issues-and-investigate-hook
hook_rc: 1
hook_result: ''
https://openqa.opensuse.org/tests/4454693
Acceptance Criteria
- AC1: Errors include line number and URL
Suggestions
- Search for curl calls not using runcurl
- Try to reproduce by calling the hook script on the same job
- https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues-and-investigate-hook
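For the first suggestion, a plain grep over a checkout of the scripts repository might be enough as a starting point. This is only a sketch: the function name `find_bare_curl_calls` is made up, and the output still needs manual review because it also lists comments and the definition of the wrapper itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines that contain "curl" as a whole word.
# With -w, "runcurl" on its own does not match, so lines that only call the
# wrapper are left out.
find_bare_curl_calls() {
    local dir=${1:-.}
    grep -rnw --exclude-dir=.git -e 'curl' "$dir"
}

# Example use from the top of a checkout:
#   find_bare_curl_calls .
```

Each hit is printed as `file:line:content`, which gives a list of candidate call sites to check against AC1.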
Updated by tinita 2 months ago
- Has duplicate action #166406: Munin - minion hook failed - see openqa-gru service logs for details - opensuse.org :: openqa.opensuse.org added
Updated by tinita 2 months ago
Did you run the script on the one mentioned job where the hook_rc was 1 to see if you can reproduce? https://openqa.opensuse.org/tests/4454693
Updated by tinita 2 months ago · Edited
I was curious and did, and then realized that the original job is gone and this comes from openqa-investigate:
origin_job_data=$(client-get-job "$origin_job_id") || return $?
I think your PR makes sense, but I think if in an investigation job we don't have the original job anymore, we can just return zero.
So maybe you can check here for a 404 as well?
Updated by mkittler 2 months ago
- Status changed from Feedback to Resolved
> I think your PR makes sense, but I think if in an investigation job we don't have the original job anymore, we can just return zero.
I think so, too. That's what I changed in commit https://github.com/os-autoinst/scripts/pull/344/commits/dbc759f473497c12d1e87027239ffb12c86ea738 of the mentioned PR (and I tested it as mentioned on https://github.com/os-autoinst/scripts/pull/344#issue-2510664472).
With the PR merged I consider this ticket resolved.
Updated by tinita 2 months ago
- Status changed from Resolved to Feedback
Did you test this on the actual job I mentioned here?
I did have a reason for asking :)
Your PR is about the actual job that the hook is running on.
My comment is about the original job, in case the hook is running on an investigation job.
And that you can only see if you run the script on https://openqa.opensuse.org/tests/4454693 because the job exists but the original job was deleted.
That was one of the tests that the script failed on initially.
The line I mean is this: https://github.com/os-autoinst/scripts/blob/8a537a6dcd97c0f67dd784264af70288dd618843/openqa-investigate#L286
origin_job_data=$(client-get-job "$origin_job_id") || return $?
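One way to treat a deleted original job as "nothing to do" while still failing on other errors could look like the sketch below. It is an assumption-laden illustration, not the code from the PR: `fetch_job_json` is a hypothetical stand-in for client-get-job, and `origin_job_url` in the usage comment is a made-up variable. It uses curl's `-w '%{http_code}'` option to read the HTTP status without `--fail`, so a 404 can be told apart from other failures.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for client-get-job (an assumption, not the real helper).
# Prints the response body on HTTP 200, returns 0 with no output on 404
# (the job was deleted), and returns 1 on any other status or curl failure.
fetch_job_json() {
    local url=$1 body status rc=0
    body=$(mktemp) || return 1
    # -o stores the body in a file, -w prints the status code after the transfer
    status=$(curl -sS -o "$body" -w '%{http_code}' "$url") || rc=1
    case "$status" in
        200) cat "$body" ;;
        404) echo "$url is gone (404), skipping" >&2 ;;
        *)
            echo "$url returned HTTP status '$status'" >&2
            rc=1
            ;;
    esac
    rm -f "$body"
    return "$rc"
}

# Possible use inside a function of the investigation script:
#   origin_job_data=$(fetch_job_json "$origin_job_url") || return $?
#   [[ -z $origin_job_data ]] && return 0  # original job deleted, nothing to compare
```

With this shape the hook would exit with zero for investigation jobs whose original job no longer exists, and the error message for any other status would still name the URL.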