action #64857
closedcoordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues
coordination #62420: [epic] Distinguish all types of incompletes
Put single-line error messages into incomplete reason for "died"
0%
Description
Motivation¶
See #62420#note-17 . In case of incompletes with reason "died" we already put the message "terminated prematurely, see log output for details". Also we output the logs from autoinst-log.txt directly on the "Details" tab when there are no module results. For single simple error messages, e.g. especially single-line error strings from the backend, we should put that string into the reason instead of "see log output for details" to make the error information more easily accessible to both the web UI users as well as readable over API.
Acceptance criteria¶
- AC1: Single line error messages from the backend are put in reason and accessible over job API as well as web UI
- AC2: In case the error is not obvious or for multi-line error messages the reason is still "died: terminated prematurely, see log output for details"
Suggestions¶
- Research how error messages are created by os-autoinst backends, e.g. qemu
- If necessary put error message not only in autoinst-log.txt but accessible to openQA, e.g. in one of the json files read by openQA
- in openQA where we write "died: terminated prematurely, see log output for details" evaluate the length of error messages: if nr_of_lines == 1 then put in reason, else still use "died: terminated prematurely, see log output for details"
Updated by livdywan almost 5 years ago
- Status changed from Workable to In Progress
- Assignee set to livdywan
- Target version set to Current Sprint
Updated by mkittler almost 5 years ago
A simple approach would be: When ever isotovideo or its sub processes encounter a fatal error condition where a single-line error message is available it is written into a file in the pool directory. Subsequent errors are likely only symptoms from tearing down the whole stack and should not override the file once it is there. In the end the worker can simply check for that file and incorporate its contents into the reason.
Updated by okurz almost 5 years ago
- Status changed from In Progress to Workable
- Assignee deleted (
livdywan) - Target version deleted (
Current Sprint)
Don't we already have a json file with the overall job result that could then be extended by one key-value with the single error line?
Updated by mkittler almost 5 years ago
Was it a mistake to unassign @cdywan?
To answer your question: Yes, we could make use of the same file. The current file is updated at a single place and it might be updated from time to time. The "error file" might be created from any place and once it is there it should not be further updated to stay as close at the source of the problem as possible. Due to these differences it might still be the simplest approach to just use a new file and avoid interference between different features/aspects.
Updated by okurz almost 5 years ago
- Status changed from Workable to In Progress
- Assignee set to livdywan
- Target version set to Current Sprint
oh, yes. Honest mistake. I am sorry. Probably had the ticket open in my browser for too long without refreshing and somehow did not realize that it updated in this way. I am ok with new file if you think this is best. I trust you have better experience in this field.
Updated by livdywan over 4 years ago
os-autoinst part here https://github.com/os-autoinst/os-autoinst/pull/1383
Updated by livdywan over 4 years ago
https://github.com/os-autoinst/openQA/pull/2924 OpenQA part here
I do wonder where we want to enforce this being a single line. Do we want to check the number of lines when reading the reason? And how does this relate to what's proposed in #25814#note-15
Updated by okurz over 4 years ago
cdywan wrote:
[…] And how does this relate to what's proposed in #25814#note-15
Well, what I mentioned there: "We can extend that concept to every not-passed job as long as we limit the size that we store in the database, e.g. cut reason string at a reasonable, human-readable length, e.g. 120 characters. This way we can also load the reason synchronously."
So we can keep as much data, e.g. multiple, long lines when we are talking about storing in files and job results but as soon as we parse the reason into the database, e.g. the API call from the worker, we should trim.
Updated by mkittler over 4 years ago
The PR for the openQA part has been merged.
@cdywan Have you already prepared changed for os-autoinst? I've just picked up the ticket #64884 but now I'm realizing that you're working on the same area so it would be better to implement this ticket first.
Updated by mkittler over 4 years ago
- Blocks action #64884: Distinguish test contributor errors from unexpected backend crashes added
Updated by livdywan over 4 years ago
mkittler wrote:
The PR for the openQA part has been merged.
@cdywan Have you already prepared changed for os-autoinst? I've just picked up the ticket #64884 but now I'm realizing that you're working on the same area so it would be better to implement this ticket first.
Yes. Apparently there's a failure in CI which I need to sort out (https://github.com/os-autoinst/os-autoinst/pull/1383), will look into that.
Updated by mkittler over 4 years ago
Your PR has been merged. If that's all what was left to do you can mark the ticket as resolved.
Updated by livdywan over 4 years ago
- Status changed from In Progress to Feedback
Updated by mkittler over 4 years ago
- Status changed from Feedback to Resolved
- Target version deleted (
Current Sprint)
As noted in #64884 (which is built on top of this feature) it works in production.