Document GRU cleanup strategy
We have the following job: https://openqa.opensuse.org/tests/741176# which was started at 8 am.
However, GRU has removed one of the repos:
[2018-08-24T01:21:01.0763 UTC] [info] GRU: removing /var/lib/openqa/share/factory/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot20180820-debuginfo
So the job fails. GRU hasn't removed the repo referenced in the REPO_0 variable, though. The assets are not older than 14 days and there is plenty of space available.
#2 Updated by mkittler over 1 year ago
- Target version deleted
@szarate My PR which might affect this hasn't even been merged yet. I note this because it might make sense to merge those changes first before trying to fix this issue.
> Assets are not older than 14 days and there is plenty of space available.
That limit is only applied to jobs which aren't in any group. But yes, due to a bug it could have been applied here accidentally.
> and there is plenty of space available
The job you mention in the ticket summary is in the group openSUSE Tumbleweed AArch64, where 78 GiB of 80 GiB are used. I wouldn't say that 2 GiB is 'plenty of space'. Or do you think the value of 78 GiB is computed wrongly?
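To make the per-group limit discussed here concrete, below is a minimal Python sketch (not openQA's actual Perl implementation) of how a size quota like the 80 GiB one could drive asset removal: assets whose newest referencing job is most recent are kept first, and whatever no longer fits under the quota is marked for removal. The `Asset` class, field names, and ordering policy are all assumptions for illustration.

```python
from dataclasses import dataclass

GiB = 1024 ** 3

@dataclass
class Asset:
    name: str
    size: int                   # bytes
    newest_job_age_days: float  # age of the newest job referencing this asset

def assets_to_remove(assets, group_quota_bytes):
    """Hypothetical per-group quota cleanup: keep assets whose newest
    referencing job is most recent; mark the rest once the quota is full."""
    used = 0
    doomed = []
    for asset in sorted(assets, key=lambda a: a.newest_job_age_days):
        if used + asset.size <= group_quota_bytes:
            used += asset.size
        else:
            doomed.append(asset.name)
    return doomed
```

With a full 80 GiB quota, even a relatively fresh asset can end up on the removal list, which matches the situation described in this ticket.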
#4 Updated by mkittler over 1 year ago
- Target version set to Ready
> However, 80 GB should still be enough to at least keep the latest assets and not remove some of them.
Yes, it should of course prefer removing old assets, and removing assets from pending jobs should be prevented entirely. So yes, it is a bug. (I only removed the target version by mistake.)
It would be interesting to see whether this still occurs after my optimization/refactoring. While doing that, I also added some more tests (which, however, only run with a few assets from our fixtures).
#7 Updated by coolo over 1 year ago
- Subject changed from [tools] gru cleans up assets too early to Document GRU cleanup strategy
- Category set to 140
- Target version changed from Ready to Current Sprint
GRU doesn't care about the age of the asset, only about the age of the jobs using it. So if only old jobs are attached to an asset, it will be removed.
So if your job group isn't large enough to hold the current assets, you need to increase its size limit; there is no way around it. That's why we put so many details on the /admin/assets page.
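The rule described above can be sketched as follows. This is a hedged illustration, not openQA's code: it assumes an asset is removable only when every job referencing it is older than some retention window (the 14-day value is borrowed from the groupless-job limit mentioned earlier in this ticket); the asset's own mtime never enters the decision.

```python
from datetime import datetime, timedelta

def asset_is_removable(job_finish_times, now, keep_days=14):
    """Removable only if every job referencing the asset finished before
    the cutoff. Note: the asset's own age is never consulted."""
    cutoff = now - timedelta(days=keep_days)
    return all(finished < cutoff for finished in job_finish_times)
```

A single recent job attached to an asset is enough to keep it alive under this rule, which is why an old asset can survive while a newer one tied only to old jobs gets removed.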