action #178249
coordination #176337: [saga][epic] Stable os-autoinst backends with stable command execution (no mistyping)
coordination #176340: [epic] Stable qemu backend with no unexpected mistyping
load detection and job flagging under high load conditions in openQA job execution size:S
% Done: 0%
Description
Motivation
Related to point 2 of #175060#note-13
Recent test failures in openQA seem to be related to high system load in CircleCI. Similar situations can happen in any environment running openQA, so we should introduce load detection during openQA job execution and flag failing openQA jobs where the load was high and the result might be tainted.
Acceptance criteria
- AC1: Failed, unreviewed openQA jobs clearly show whether the load on the executing openQA worker was high
- AC2: No overly verbose output is shown if the load is not high
Suggestions
- https://app.circleci.com/pipelines/github/os-autoinst/openQA/16047/workflows/be893671-47af-4c11-a2a7-a69140b42c57/jobs/154592
- Override the load detection to more easily reproduce the issue?
- Consider the number of CPUs in this context
- Come up with a general definition of "high load" or provide a mechanism that can easily be customized for individual workers
- Should isotovideo be aware of this mechanism? Or at least provide data? isotovideo could print out the system load during execution regardless of the final result. Then one of our investigation scripts or the worker could read this value and present it, e.g. in an openqa-investigate comment. As an alternative, implement this within the openQA worker (using the existing code for the load check while the worker is idling).
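To make the suggestions above more concrete, here is a rough sketch of what a CPU-aware load check could look like. It is written in Python purely for illustration and is not existing os-autoinst or openQA code. The threshold of 1.0 per CPU, the `LOAD_THRESHOLD_PER_CPU` environment variable and all function names are assumptions made up for this example. The override is one possible way to force the "high load" path when trying to reproduce the issue.

```python
import os

# Hypothetical default: "high load" means the 1-minute load average
# exceeds 1.0 per CPU. This value is an assumption, not a project setting.
DEFAULT_THRESHOLD_PER_CPU = 1.0


def load_per_cpu(load_avg, cpu_count):
    """Normalize a load average by the CPU count (treated as at least 1)."""
    return load_avg / max(cpu_count or 1, 1)


def is_high_load(load_avg, cpu_count, threshold=DEFAULT_THRESHOLD_PER_CPU):
    """Return True if the per-CPU load is above the threshold."""
    return load_per_cpu(load_avg, cpu_count) > threshold


def load_note(load_avg, cpu_count, threshold=DEFAULT_THRESHOLD_PER_CPU):
    """Return a one-line note when the load is high, otherwise None.

    Returning None for normal load keeps the output quiet (see AC2).
    """
    if not is_high_load(load_avg, cpu_count, threshold):
        return None
    return (
        f"High load detected: {load_avg:.2f} on {cpu_count} CPUs "
        f"(threshold {threshold:.2f} per CPU)"
    )


def current_load_note():
    """Check the current system load (Unix only).

    The threshold can be overridden per worker through the hypothetical
    LOAD_THRESHOLD_PER_CPU environment variable, e.g. set to 0 to force
    the note while reproducing the issue.
    """
    threshold = float(
        os.environ.get("LOAD_THRESHOLD_PER_CPU", DEFAULT_THRESHOLD_PER_CPU)
    )
    load1, _load5, _load15 = os.getloadavg()
    return load_note(load1, os.cpu_count(), threshold)
```

A caller such as a worker or an investigation script could call `current_load_note()` and only attach the returned line to a failed job when it is not `None`. Whether the 1-minute average is the right signal, or whether a longer window fits better, is left open here.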
Updated by gpuliti 30 days ago
- Precedes action #175060: [sporadic] [Workflow] Failed: os-autoinst/openQA on master / test (7dc9d82) size:M added