coordination #110833
[saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances
Status: open
Start date: 2022-05-09
Due date:
% Done: 80%
Estimated time: (Total: 0.00 h)
Description
Motivation
- See #110785
Ideas
- Test locally by scheduling something like 100k jobs and see how the scheduler scales
- Test locally by scheduling many jobs on something like 1k worker instances and see how the scheduler scales
- Note that a unit test for scalability already exists; it could simply be invoked with very high numbers of scheduled jobs and available workers
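
To illustrate the kind of local experiment these ideas describe, here is a minimal sketch of a toy benchmark. It is not openQA code and does not use the openQA scheduler or its existing scalability test; `assign_jobs` and `measure` are hypothetical helpers that only mimic "idle workers pick up scheduled jobs" so that the effect of growing job and worker counts can be timed.

```python
import time
from collections import deque


def assign_jobs(job_count, worker_count):
    """Toy greedy assignment (assumption, not the openQA algorithm):
    each idle worker takes the next scheduled job until one side runs out."""
    scheduled = deque(range(job_count))
    idle_workers = deque(range(worker_count))
    assignments = {}
    while scheduled and idle_workers:
        assignments[idle_workers.popleft()] = scheduled.popleft()
    # Return the worker->job mapping and how many jobs are still scheduled.
    return assignments, len(scheduled)


def measure(job_count, worker_count):
    """Time a single assignment round for the given sizes."""
    start = time.perf_counter()
    assignments, remaining = assign_jobs(job_count, worker_count)
    elapsed = time.perf_counter() - start
    return len(assignments), remaining, elapsed


if __name__ == "__main__":
    # Sizes grow toward the 100k jobs / 1k workers target named in the title.
    for jobs, workers in [(1_000, 10), (10_000, 100), (100_000, 1_000)]:
        assigned, remaining, elapsed = measure(jobs, workers)
        print(f"{jobs} jobs, {workers} workers: "
              f"{assigned} assigned, {remaining} waiting, {elapsed:.4f}s")
```

A real measurement would replace `assign_jobs` with calls into an actual openQA instance or its scalability unit test; the sketch only shows the shape of such a scaling loop.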
Updated by okurz over 2 years ago
- Copied from coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results added
Updated by okurz over 2 years ago
- Related to action #110785: OSD incident 2022-05-09: Many scheduled jobs not picked up despite idle workers, blocked by one worker instance that should be broken? added
Updated by okurz 5 months ago
- Copied to coordination #164466: [saga][epic] Scale up: Hyper-responsive openQA webUI added
Updated by okurz 3 months ago
- Related to action #167557: OSD not starting new jobs on 2024-09-28 due to >1k worker instances connected, overloading websocket server added