action #97412

openQA Project - coordination #64746: [saga][epic] Scale up: Efficient handling of large storage to be able to run current tests efficiently but keep big archives of old results

openQA Project - action #98463: [epic] Avoid too slow asset downloads leading to jobs exceeding the timeout with or run into auto_review:"(timeout: setup exceeded MAX_SETUP_TIME|Cache service queue already full)":retry

Reduce I/O load on OSD by using a larger cache on workers, using free disk space when available instead of a hardcoded size

Added by okurz 3 months ago. Updated 3 months ago.


I am sure we could spread the load on OSD a bit when many jobs start if we manage to have fewer assets that need to be downloaded at the same time. We can increase the cache size, e.g. use all available space instead of the artificial limits of the cache directory.

The worker cache could just ensure a certain percentage of free disk space in the file system.
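The idea of keeping a fixed share of the file system free, rather than enforcing a hardcoded cache size, could be sketched roughly as follows. This is only an illustration and not the openQA worker cache implementation: the function names, the `min_free_ratio` parameter and its default of 10% are assumptions made for this example.

```python
import shutil


def allowed_cache_bytes(total_bytes, free_bytes, current_cache_bytes, min_free_ratio):
    """Return how large the cache may be while keeping min_free_ratio of the file system free.

    All names here are hypothetical; they only illustrate the proposal above.
    """
    if not 0 <= min_free_ratio < 1:
        raise ValueError("min_free_ratio must be in [0, 1)")
    # Number of bytes that must stay free on the file system.
    reserve = int(total_bytes * min_free_ratio)
    # The cache may occupy what it already uses plus any free space beyond the reserve.
    # A result smaller than current_cache_bytes means assets would have to be purged.
    return max(0, current_cache_bytes + free_bytes - reserve)


def cache_limit_for(path, current_cache_bytes, min_free_ratio=0.1):
    """Compute the limit for the file system containing path (the 10% default is an assumption)."""
    usage = shutil.disk_usage(path)
    return allowed_cache_bytes(usage.total, usage.free, current_cache_bytes, min_free_ratio)
```

With such a calculation the limit grows automatically on workers with large, mostly empty disks and shrinks when other data fills up the file system, so no per-worker size would need to be configured.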

Acceptance criteria

  • AC1: Most workers on OSD, ideally all, use all available free disk space except for a configured ratio that is kept free

Related issues

Copied from openQA Infrastructure - action #96554: Mitigate on-going disk I/O alerts size:M (Resolved, 2021-08-04)


#1 Updated by okurz 3 months ago

  • Copied from action #96554: Mitigate on-going disk I/O alerts size:M added

#2 Updated by okurz 3 months ago

  • Target version changed from Ready to future

#3 Updated by mkittler 3 months ago

  • Parent task changed from #96447 to #98463

#96447 doesn't have a very meaningful ticket description, so I'm replacing the parent ticket with #98463.
