action #126677
salt-states-openqa fails with 0 errors and result False messages buried in one of several places
Description
Observation
Somehow it seems like there are no errors (literally, according to salt), but the minions are not coming back fine. There's no visible error output. The relevant errors are not easy to find, since they may be in an artifact rather than in the log itself, and listed somewhere halfway into the whole output.
Succeeded: 344 (changed=4)
Failed: 0
[...]
ERROR: Minions returned with non-zero exit code
Acceptance criteria
- AC1: Whenever minions yield error codes, the errors are obvious
Suggestions
- Ensure errors are consistently in the same place
- Avoid users having to dig for things like "Result: False" and other messages in several places
- Separate errors from various other unrelated messages, e.g. at the end of the pipeline log in the best case
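As a rough illustration of the last suggestion, here is a minimal sketch that pulls the failing state blocks out of a saved salt log so they could be repeated in one place, e.g. at the end of the pipeline log. It assumes the state results are separated by lines of dashes, as in the job output quoted in this ticket. The function name `failed_state_blocks` is made up for this example and is not part of salt or of salt-states-openqa.

```python
import re

# Assumption: salt separates state results with a line made only of dashes
# (e.g. "----------"), as seen in the job log quoted in this ticket.
SEPARATOR = re.compile(r"^-{10,}\s*$")
# A failing state has a line of its own reading "Result: False".
FAILED = re.compile(r"^\s*Result:\s*False\b")


def failed_state_blocks(log_text):
    """Return every dash-separated block of the log that contains 'Result: False'."""
    blocks, current = [], []
    for line in log_text.splitlines():
        if SEPARATOR.match(line):
            # A separator closes the block collected so far.
            blocks.append(current)
            current = []
        else:
            current.append(line)
    blocks.append(current)  # keep whatever follows the last separator
    return [
        "\n".join(block).strip()
        for block in blocks
        if any(FAILED.match(line) for line in block)
    ]
```

Printing the returned blocks after the salt run would put all "Result: False" entries in one spot. Whether that is done in the CI job itself or in a wrapper script is left open here.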
Updated by okurz almost 2 years ago
- Tags set to infra, support, log
- Due date set to 2023-04-14
- Status changed from New to Feedback
- Assignee set to okurz
- Target version set to Ready
We already had this multiple times. Whenever salt says "ERROR: Minions returned with non-zero exit code" one needs to scroll up and search for "Result: False". Doing that immediately jumps to
https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/1472445#L2067
Name: /var/lib/grafana/dashboards//generic-schort-server.json - Function: file.managed - Result: Clean Started: - 13:30:40.236551 Duration: 79.996 ms
----------
ID: /etc/grafana/provisioning/alerting/dashboard-GDschort-server.yaml
Function: file.managed
Result: False
Comment: Source file salt://monitoring/grafana/alertinig-dashboard-GD.yaml.template not found in saltenv 'base'
Started: 13:30:40.316844
Duration: 10.198 ms
Changes:
so errors can be observed and I see AC1 covered with either scrolling up or looking for "Result: False". That's just a generic approach for error messages which always might need a second look. I think this is acceptable and we can not improve that with reasonable effort. What else can we do?
Updated by livdywan almost 2 years ago
okurz wrote:
so errors can be observed and I see AC1 covered with either scrolling up or looking for "Result: False". That's just a generic approach for error messages which always might need a second look. I think this is acceptable and we can not improve that with reasonable effort. What else can we do?
Interesting. It seems the one case I linked does contain it. I went through quite a few which did not:
Updated by tinita almost 2 years ago
@cdywan In those cases you can find it in the Complete raw log
Updated by okurz almost 2 years ago
so @cdywan do you see the need for us to change anything and can we accept the current state as-is?
Updated by livdywan almost 2 years ago
- Subject changed from salt-states-openqa fails with 0 errors and Minions returned with non-zero exit code to salt-states-openqa fails with 0 errors and result False messages buried in one of several places
- Description updated (diff)
- Status changed from Feedback to New
- Assignee deleted (okurz)
- Target version deleted (Ready)
okurz wrote:
so @cdywan do you see the need for us to change anything [or] can we accept the current state as-is?
I'm turning it into a feature request. Having gotten confused by this before, I don't expect to get used to it.