action #158125

Updated by okurz about 1 month ago

## Motivation 
 In #158104 we observed typing issues due to mania being overloaded. mania was configured to run 30 openQA worker instances, which was mostly fine as proven in #139271-24. The recent overload was likely triggered by enabling video again as part of #157636. I have already reduced the number of worker instances, but this has the drawback that the long test backlog again takes longer to finish. We should be more flexible in using the available resources. Here I suggest implementing a check in the worker so that it only picks up new jobs if the CPU load is below a configured threshold. 

 ## Acceptance criteria 
 * **AC1:** An openQA worker does not start an openQA job if the CPU load is higher than the configured threshold 
 * **AC2:** By default, workers still pick up jobs if the load is not too high 

 ## Suggestions 
 * Possibly the worker code somewhere in https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Worker.pm#L472 can be extended to check the CPU load and, if it exceeds a (configurable) threshold, skip picking up the next job 
 * Add a sensible disabled default value in https://github.com/os-autoinst/openQA/blob/master/etc/openqa/workers.ini with an explanatory comment
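
 The check suggested above could look roughly like the following. This is only a language-neutral sketch in Python to illustrate the logic, not the actual openQA worker code (which is Perl). The function names and the `threshold` parameter are hypothetical, and it assumes the 1-minute load average is the metric of interest. A missing or non-positive threshold is treated as "check disabled", which matches the idea of a disabled default.

```python
import os
from typing import Optional


def is_overloaded(load_avg: float, threshold: Optional[float]) -> bool:
    """Return True if load_avg is strictly higher than the threshold.

    A threshold of None, 0 or a negative value disables the check
    (hypothetical convention for a "disabled default").
    """
    if threshold is None or threshold <= 0:
        return False
    return load_avg > threshold


def current_load_avg() -> float:
    """Return the 1-minute load average (os.getloadavg is Unix-only)."""
    return os.getloadavg()[0]


def should_pick_up_job(threshold: Optional[float],
                       load_avg: Optional[float] = None) -> bool:
    """Decide whether a worker should accept a new job.

    load_avg can be passed explicitly (useful for testing); otherwise
    the current system load is read.
    """
    if load_avg is None:
        load_avg = current_load_avg()
    return not is_overloaded(load_avg, threshold)
```

 With a threshold of 40, `should_pick_up_job(40.0, load_avg=45.0)` rejects the job (AC1), while `should_pick_up_job(None, load_avg=45.0)` accepts it because the check is disabled. Taking the load as an optional argument keeps the decision logic testable without depending on the state of the machine.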