coordination #102864
open coordination #102861: [saga][epic] Improved openQA for multi-user environments
[epic] Inform openQA webUI users about potential worker class mismatch or long delays
Start date: 2021-09-13
Due date:
% Done: 0%
Estimated time:
Description
Motivation
In #98562 the idea came up to cancel jobs with an "invalid" worker class, but that is time-dependent. Then in #100973 we implemented automatic cancellation of all jobs after a (longer) timeout so that jobs don't hang around forever. Now we can take the next step and improve the feedback to users about potential worker class mismatches or expected long delays in job execution.
Acceptance criteria
- AC1: Given a scheduled job When the worker class does not match any worker entry Then inform the user about that fact and that the job is likely misconfigured
- AC2: Given a scheduled job When the worker class does match a worker entry And there are currently no online workers for this worker class And the time since a matching worker was last online is below a configurable threshold, e.g. 10 minutes, Then inform the user about that fact and that the job will likely be executed later
- AC3: Given a scheduled job When the worker class does match a worker entry And there are currently no online workers for this worker class And the time since a matching worker was last online is above a configurable threshold, e.g. 10 minutes, Then inform the user about that fact and that there is likely an infrastructure problem and admins should be contacted
- AC4: Given a scheduled job When the worker class does match a worker entry And there are currently no free workers for this worker class And the ratio of "scheduled jobs for this worker class / available worker instances for this worker class" is high Then inform the user that longer delays are to be expected
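
To make the four criteria easier to discuss, here is a rough sketch of how they could be turned into a single decision function. It is not openQA code (openQA itself is written in Perl) and every name in it is hypothetical: `Worker`, `Hint`, `classify_scheduled_job`, and the parameters `offline_threshold` and `queue_ratio_threshold` are illustration only. It assumes that a worker matches when it provides all worker classes the job asks for, that "available worker instances" means online matching workers, and that a ratio of 5 counts as "high"; the ticket does not fix any of these.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import FrozenSet, Optional, Sequence


class Hint(Enum):
    """Feedback categories, one per acceptance criterion."""
    NO_MATCHING_WORKER = "AC1: job is likely misconfigured"
    WORKERS_RECENTLY_OFFLINE = "AC2: job will likely be executed later"
    WORKERS_LONG_OFFLINE = "AC3: likely infrastructure problem, contact admins"
    LONG_QUEUE = "AC4: longer delays are to be expected"


@dataclass(frozen=True)
class Worker:
    """Hypothetical worker entry; the fields are assumptions for this sketch."""
    worker_classes: FrozenSet[str]
    online: bool
    busy: bool
    last_seen: datetime


def classify_scheduled_job(
    job_classes: FrozenSet[str],
    workers: Sequence[Worker],
    scheduled_for_class: int,
    now: datetime,
    offline_threshold: timedelta = timedelta(minutes=10),  # example value from AC2/AC3
    queue_ratio_threshold: float = 5.0,  # assumed meaning of a "high" ratio
) -> Optional[Hint]:
    """Return the hint to show for a scheduled job, or None if nothing applies."""
    # Assumption: a worker matches if it offers every class the job requires.
    matching = [w for w in workers if job_classes <= w.worker_classes]
    if not matching:
        return Hint.NO_MATCHING_WORKER

    online = [w for w in matching if w.online]
    if not online:
        # Use the most recent time any matching worker was seen.
        last_seen = max(w.last_seen for w in matching)
        if now - last_seen < offline_threshold:
            return Hint.WORKERS_RECENTLY_OFFLINE
        # Exactly at the threshold is treated like "above" here (a choice of this sketch).
        return Hint.WORKERS_LONG_OFFLINE

    # AC4 only applies when no matching online worker is free.
    if all(w.busy for w in online):
        ratio = scheduled_for_class / len(online)
        if ratio >= queue_ratio_threshold:
            return Hint.LONG_QUEUE
    return None
```

The checks run in the order AC1, AC2/AC3, AC4, so each job gets at most one hint. Both thresholds are parameters because the criteria call for the offline threshold to be configurable and leave "high" undefined.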