action #166403
closed
Munin - minion hook failed - see openqa-gru service logs for details - 404 Not Found size:S
Description
Observation
Date: Wed, 04 Sep 2024 12:05:07 +0000
Subject: Munin - minion hook failed - see openqa-gru service logs for details - opensuse.org :: openqa.opensuse.org
opensuse.org :: openqa.opensuse.org :: hook failed - see openqa-gru service logs for details
CRITICALs: rc_failed_per_5min is 32.00 (outside range [:10]).
% journalctl -u openqa-gru --since '2024-09-04'
...
Sep 04 12:02:24 ariel openqa-gru[7200]: 404 Not Found
Sep 04 12:02:44 ariel openqa-gru[7200]:
Sep 04 12:02:44 ariel openqa-gru[11188]: 'http://openqa.opensuse.org/tests/4454336' does not have autoinst-log.txt but is rather old, ignoring
Sep 04 12:03:28 ariel openqa-gru[18613]: 404 Not Found
Sep 04 12:03:31 ariel openqa-gru[18613]:
Sep 04 12:03:31 ariel openqa-gru[18944]: 404 Not Found
Sep 04 12:03:36 ariel openqa-gru[18944]:
Sep 04 12:03:36 ariel openqa-gru[20180]: 404 Not Found
Sep 04 12:03:36 ariel openqa-gru[20180]:
Sep 04 12:03:36 ariel openqa-gru[20178]: 404 Not Found
Sep 04 12:03:42 ariel openqa-gru[20178]:
Sep 04 12:03:42 ariel openqa-gru[21430]: 404 Not Found
Sep 04 12:03:42 ariel openqa-gru[21430]:
Sep 04 12:03:42 ariel openqa-gru[21445]: 404 Not Found
Sep 04 12:03:49 ariel openqa-gru[21445]:
Sep 04 12:03:49 ariel openqa-gru[22756]: 404 Not Found
Sep 04 12:03:51 ariel openqa-gru[22756]:
Sep 04 12:03:51 ariel openqa-gru[23156]: 404 Not Found
Sep 04 12:03:54 ariel openqa-gru[23156]:
Sep 04 12:03:54 ariel openqa-gru[23770]: 404 Not Found
Sep 04 12:03:55 ariel openqa-gru[23770]:
Sep 04 12:03:55 ariel openqa-gru[24146]: 404 Not Found
Sep 04 12:03:56 ariel openqa-gru[24146]:
Sep 04 12:03:56 ariel openqa-gru[24646]: 404 Not Found
Sep 04 12:04:07 ariel openqa-gru[24646]:
...
Unfortunately we don't see a script name or line number in the log, although we have a mechanism (e.g. in runcurl)
that reports the caller in case of an error. Maybe this call doesn't go through that wrapper.
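To illustrate the kind of mechanism meant here, the following is a minimal sketch of a wrapper that reports its caller on failure. It is not the actual runcurl implementation; the function name `report_caller_on_error` is made up for this example, and it wraps an arbitrary command rather than curl specifically.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (NOT the real runcurl): run a command and, if it fails,
# report where the wrapper was called from together with the command line.
report_caller_on_error() {
    local rc=0 output
    output=$("$@" 2>&1) || rc=$?
    if [[ $rc -ne 0 ]]; then
        # The bash builtin `caller 0` prints line, function and file of the
        # frame that called this function.
        echo "error ($(caller 0)): '$*' failed with rc=$rc: $output" >&2
        return "$rc"
    fi
    printf '%s\n' "$output"
}
```

A call that bypasses such a wrapper would only print the bare "404 Not Found" seen in the journal, which would match the observation above.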
One related Minion job, guessing from the timestamp, is this:
https://openqa.opensuse.org/minion/jobs?id=4272268
notes:
  hook_cmd: env from_email=o3-admins@suse.de scheme=http enable_force_result=true
    email_unreviewed=true exclude_group_regex='(Development|Open Build Service|Others|Kernel).*/.*'
    /opt/os-autoinst-scripts/openqa-label-known-issues-and-investigate-hook
  hook_rc: 1
  hook_result: ''
https://openqa.opensuse.org/tests/4454693
Acceptance Criteria
- AC1: Errors include line number and URL
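One way an error message could satisfy AC1 is sketched below. The helper `check_response` is hypothetical and assumes the response was fetched with `curl -s -w '\n%{http_code}\n' "$url"`, so that the HTTP status code is the last line of the captured output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: $2 is the captured output of
#   curl -s -w '\n%{http_code}\n' "$url"
# i.e. the body followed by the HTTP status code on the last line.
check_response() {
    local url=$1 response=$2 status body
    status=${response##*$'\n'}   # last line: status code
    body=${response%$'\n'*}      # everything before the last line
    if [[ ! $status =~ ^2[0-9][0-9]$ ]]; then
        # BASH_LINENO[0] is the line from which this function was called.
        echo "${BASH_SOURCE[1]:-main}:${BASH_LINENO[0]}: HTTP $status for $url" >&2
        return 1
    fi
    printf '%s\n' "$body"
}
```

With a message of this shape, the journal would show both the failing URL and the calling line instead of only "404 Not Found".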
Suggestions
- Search for curl calls not using runcurl
- Try to reproduce by calling the hook script on the same job
- https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues-and-investigate-hook
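The first suggestion could be approached with a heuristic search like the sketch below. The function name `find_bare_curl_calls` is an assumption for this example. It relies on `grep -w`, which matches `curl` only as a whole word, so `runcurl` is not reported, and it drops comment-only lines.

```shell
#!/usr/bin/env bash
# Heuristic: list lines that mention "curl" as a whole word (so "runcurl" is
# skipped) and are not pure comment lines. Output format: file:line:content
find_bare_curl_calls() {
    local dir=${1:-.}
    grep -rnw --exclude-dir=.git -- 'curl' "$dir" \
        | grep -vE '^[^:]+:[0-9]+:[[:space:]]*#' || true
}
```

Running `find_bare_curl_calls /path/to/scripts` on a checkout would also list the wrapper's own curl invocation, so the result still needs a manual look.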
Updated by livdywan about 1 month ago
- Subject changed from Munin - minion hook failed - see openqa-gru service logs for details to Munin - minion hook failed - see openqa-gru service logs for details size:s
- Description updated (diff)
- Status changed from New to Workable
Updated by tinita about 1 month ago
- Subject changed from Munin - minion hook failed - see openqa-gru service logs for details size:s to Munin - minion hook failed - see openqa-gru service logs for details - 404 Not Found size:S
Updated by tinita about 1 month ago
- Has duplicate action #166406: Munin - minion hook failed - see openqa-gru service logs for details - opensuse.org :: openqa.opensuse.org added
Updated by mkittler about 1 month ago
- Status changed from Workable to In Progress
- Assignee set to mkittler
Updated by mkittler about 1 month ago
- Status changed from In Progress to Resolved
Updated by tinita about 1 month ago
Did you run the script on the one mentioned job where the hook_rc was 1 to see if you can reproduce? https://openqa.opensuse.org/tests/4454693
Updated by tinita about 1 month ago · Edited
I was curious and did, and then realized that the original job is gone and this comes from openqa-investigate:
origin_job_data=$(client-get-job "$origin_job_id") || return $?
I think your PR makes sense, but I think if in an investigation job we don't have the original job anymore, we can just return zero.
So maybe you can check here for a 404 as well?
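A rough sketch of what such a check could look like follows. It is not the code from the PR. `fake_get_job` is a stand-in for `client-get-job`, and the sketch assumes that a missing job makes the helper print "404 Not Found" and exit non-zero, which matches the journal output but is not verified against the real helper.

```shell
#!/usr/bin/env bash
# Stand-in for client-get-job (assumption: a missing job prints
# "404 Not Found" on stderr and returns non-zero). Job IDs are made up.
fake_get_job() {
    case $1 in
        1) echo '{"job": {"id": 1}}' ;;
        *) echo "404 Not Found" >&2; return 1 ;;
    esac
}

# Fetch the origin job; treat a 404 as "nothing to investigate" (return 0
# with empty output) and propagate every other failure.
get_origin_job_data() {
    local origin_job_id=$1 out rc=0
    out=$(fake_get_job "$origin_job_id" 2>&1) || rc=$?
    if [[ $rc -ne 0 ]]; then
        if [[ $out == *"404 Not Found"* ]]; then
            echo "origin job $origin_job_id no longer exists, skipping" >&2
            return 0
        fi
        echo "$out" >&2
        return "$rc"
    fi
    printf '%s\n' "$out"
}
```

A caller would then treat empty output as the signal to stop early, for example `origin_job_data=$(get_origin_job_data "$origin_job_id") || return $?` followed by `[[ -z $origin_job_data ]] && return 0`.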
Updated by mkittler about 1 month ago
- Status changed from Feedback to Resolved
> I think your PR makes sense, but I think if in an investigation job we don't have the original job anymore, we can just return zero.
I think so, too. That's what I changed in commit https://github.com/os-autoinst/scripts/pull/344/commits/dbc759f473497c12d1e87027239ffb12c86ea738 of the mentioned PR (and I tested it as mentioned on https://github.com/os-autoinst/scripts/pull/344#issue-2510664472).
With the PR merged I consider this ticket resolved.
Updated by tinita about 1 month ago
- Status changed from Resolved to Feedback
Did you test this on the actual job I mentioned here?
I did have a reason for asking :)
Your PR is about the actual job that the hook is running on.
My comment is about the original job, in case the hook is running on an investigation job.
And that you can only see if you run the script on https://openqa.opensuse.org/tests/4454693 because the job exists but the original job was deleted.
That was one of the tests that the script failed on initially.
The line I mean is this: https://github.com/os-autoinst/scripts/blob/8a537a6dcd97c0f67dd784264af70288dd618843/openqa-investigate#L286
origin_job_data=$(client-get-job "$origin_job_id") || return $?
Updated by mkittler about 1 month ago
- Status changed from Feedback to In Progress
Ah, I didn't get the distinction. I'll test it and probably I'll also need to tweak the code accordingly.
Updated by mkittler about 1 month ago
- Status changed from In Progress to Feedback