action #40223
closed
Document GRU cleanup strategy
Description
We have the following job, started at 8am: https://openqa.opensuse.org/tests/741176#
However, GRU has removed one of the repos:
[2018-08-24T01:21:01.0763 UTC] [info] GRU: removing /var/lib/openqa/share/factory/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot20180820-debuginfo
So the job fails. GRU hasn't removed the repo in the REPO_0 variable, though. Assets are not older than 14 days and there is plenty of space available.
Updated by szarate over 6 years ago
- Target version set to Ready
We seem to be having this problem on osd too.
Updated by mkittler over 6 years ago
- Target version deleted (Ready)
@szarate And my PR which might affect this hasn't even been merged yet. I note this because it might make sense to merge these changes first before trying to fix this issue.
> Assets are not older than 14 days and there is plenty of space available.
That limit is only applied to jobs which aren't in a group. But yes, due to a bug it could have been applied here accidentally.
> and there is plenty of space available
The job you mention in the ticket summary is in group openSUSE Tumbleweed AArch64 where 78 GiB of 80 GiB are used. I wouldn't say that 2 GiB is 'plenty of space'. Or do you think the value of 78 GiB is wrongly computed?
Updated by riafarov over 6 years ago
@mkittler I was not aware of that setting, to be honest; I meant the disk space available. So we can actually increase the space for it. However, still 80GB is enough to keep latest assets at least and not remove some of them.
Updated by mkittler over 6 years ago
- Target version set to Ready
> However, still 80GB is enough to keep latest assets at least and not remove some of them.
Yes, it should of course preferably remove old assets, and removing assets of pending jobs should be prevented altogether. So yes, it is a bug. (I removed the target version only by mistake.)
It would be interesting to see whether this still occurs after my optimization/refactoring. When doing that, I also added some more tests (which, however, are only run with a few assets from our fixtures).
Updated by riafarov over 6 years ago
- Related to action #40256: [opensuse][functional][y] Verify if repo is available before adding it in zypper_ar test module added
Updated by szarate over 6 years ago
- Related to action #40349: [sle][functional][u][migration][sle15sp1] Jobs incomplete due to missing assets added
Updated by coolo over 6 years ago
- Subject changed from [tools] gru cleans up assets too early to Document GRU cleanup strategy
- Category set to 140
- Target version changed from Ready to Current Sprint
GRU doesn't care about the age of the asset, only about the age of the job using it. So if there are only old jobs attached to that asset, it will remove it.
So if your job group isn't large enough to hold the current assets, you need to increase its size limit; there is no way around it. That's why we put so many details on the /admin/assets page.
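To illustrate the rule above, here is a minimal sketch in Python. It is not the actual GRU code: the `Asset` class, the `plan_cleanup` function, the use of job IDs as a stand-in for job age, and the simple greedy fill are all assumptions made for the example. It only shows the idea that assets are ranked by the newest job using them and that whatever no longer fits into the group's size limit gets removed.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    size_gib: float
    # Hypothetical field: ID of the most recent job using this asset.
    # A higher ID is treated as a newer job (an assumption of this sketch).
    newest_job_id: int


def plan_cleanup(assets, limit_gib):
    """Split the assets of one job group into (kept, removed)."""
    kept, removed = [], []
    used = 0.0
    # Rank by the newest job using the asset, not by the asset's own age.
    for asset in sorted(assets, key=lambda a: a.newest_job_id, reverse=True):
        if used + asset.size_gib <= limit_gib:
            kept.append(asset)
            used += asset.size_gib
        else:
            removed.append(asset)
    return kept, removed


# Made-up assets and sizes, with an 80 GiB group limit.
assets = [
    Asset("old-repo", 30, newest_job_id=100),
    Asset("new-iso", 40, newest_job_id=300),
    Asset("mid-repo", 30, newest_job_id=200),
]
kept, removed = plan_cleanup(assets, limit_gib=80)
print([a.name for a in kept])     # ['new-iso', 'mid-repo']
print([a.name for a in removed])  # ['old-repo']
```

In this model an asset that was uploaded recently but is only referenced by old jobs is still the first candidate for removal, which is why increasing the group's limit is the only remedy when the current assets do not fit.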
Updated by mkittler about 6 years ago
- Status changed from New to In Progress
Updated by coolo about 6 years ago
- Target version changed from Current Sprint to Done