coordination #110833
[saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances
Start date: 2022-05-09
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Difficulty:
Description
Motivation
- See #110785
Ideas
- Test locally by scheduling on the order of 100k jobs and observe how the scheduler scales
- Test locally by scheduling many jobs against on the order of 1k worker instances and observe how the scheduler scales
- A unit test for scalability already exists; it could simply be invoked with very high numbers of scheduled jobs and available workers
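
To get a first feeling for how matching cost grows with the number of jobs and workers, a toy model can be timed locally before touching a real instance. The sketch below is not openQA's scheduler and does not use its API: `schedule_tick`, `make_load`, `measure` and the worker class names are hypothetical and only illustrate the kind of measurement suggested above, assuming that jobs are matched to idle workers by a single worker class.

```python
import time
from collections import defaultdict, deque


def schedule_tick(jobs, workers):
    """Assign each scheduled job to an idle worker of the same class.

    jobs:    list of (job_id, worker_class)
    workers: list of (worker_id, worker_class)
    Returns a dict mapping job_id -> worker_id for jobs that got a worker.
    """
    # Group idle workers by class so each lookup is O(1).
    idle = defaultdict(deque)
    for worker_id, worker_class in workers:
        idle[worker_class].append(worker_id)

    assigned = {}
    for job_id, worker_class in jobs:
        if idle[worker_class]:
            assigned[job_id] = idle[worker_class].popleft()
    return assigned


def make_load(job_count, worker_count, classes=("qemu_x86_64", "qemu_aarch64")):
    """Build synthetic jobs and workers, spread evenly over the classes."""
    jobs = [(j, classes[j % len(classes)]) for j in range(job_count)]
    workers = [(w, classes[w % len(classes)]) for w in range(worker_count)]
    return jobs, workers


def measure(job_count, worker_count):
    """Return (number of assigned jobs, seconds spent in one tick)."""
    jobs, workers = make_load(job_count, worker_count)
    start = time.perf_counter()
    assigned = schedule_tick(jobs, workers)
    return len(assigned), time.perf_counter() - start


if __name__ == "__main__":
    for job_count, worker_count in [(1_000, 10), (10_000, 100), (100_000, 1_000)]:
        n, seconds = measure(job_count, worker_count)
        print(f"{job_count} jobs, {worker_count} workers: "
              f"{n} assigned in {seconds:.4f}s")
```

Such a model only shows the shape of the experiment (grow both numbers, record the time per tick); conclusions about openQA itself still need the real scheduler, for example via the existing scalability unit test.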
History
#1
Updated by okurz about 2 months ago
- Copied from coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results added
#2
Updated by okurz about 2 months ago
- Related to action #110785: OSD incident 2022-05-09: Many scheduled jobs not picked up despite idle workers, blocked by one worker instance that should be broken? added