action #40223

Document GRU cleanup strategy

Added by riafarov over 1 year ago. Updated over 1 year ago.

Status: Resolved
Start date: 24/08/2018
Priority: Normal
Due date:
Assignee: mkittler
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

We have the following job, which started at 8 am: https://openqa.opensuse.org/tests/741176#

However, GRU had already removed one of the repos:

[2018-08-24T01:21:01.0763 UTC] [info] GRU: removing /var/lib/openqa/share/factory/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot20180820-debuginfo

So the job fails. GRU hasn't removed the repo in the REPO_0 variable, though. Assets are not older than 14 days and there is plenty of space available.


Related issues

Related to openQA Tests - action #40256: [opensuse][functional][y] Verify if repo is available bef... Resolved 27/08/2018 25/09/2018
Related to openQA Tests - action #40349: [sle][functional][u][migration][sle15sp1] Jobs incomplete... Resolved 28/08/2018

History

#1 Updated by szarate over 1 year ago

  • Target version set to Ready

We seem to be having this problem in osd too

#2 Updated by mkittler over 1 year ago

  • Target version deleted (Ready)

@szarate My PR, which might affect this, hasn't even been merged yet. I mention it because it might make sense to merge those changes before trying to fix this issue.

> Assets are not older than 14 days and there is plenty of space available.

That limit is only applied to jobs which aren't in a group. But yes, due to a bug it could have been applied here accidentally.

> and there is plenty of space available

The job you mention in the ticket summary is in group openSUSE Tumbleweed AArch64 where 78 GiB of 80 GiB are used. I wouldn't say that 2 GiB is 'plenty of space'. Or do you think the value of 78 GiB is wrongly computed?

#3 Updated by riafarov over 1 year ago

@mkittler To be honest, I was not aware of that setting; I meant the available disk space. So we can actually increase the space for it. However, still 80GB is enough to keep latest assets at least and not remove some of them.

#4 Updated by mkittler over 1 year ago

  • Target version set to Ready

> However, still 80GB is enough to keep latest assets at least and not remove some of them.

Yes, it should of course preferably remove old assets, and removing assets of pending jobs should be prevented entirely. So yes, it is a bug. (I removed the target version only by mistake.)

It would be interesting to see whether this still occurs after my optimization/refactoring. When doing that, I also added some more tests (which, however, only cover a few assets from our fixtures).

#5 Updated by riafarov over 1 year ago

  • Related to action #40256: [opensuse][functional][y] Verify if repo is available before adding it in zypper_ar test module added

#6 Updated by szarate over 1 year ago

  • Related to action #40349: [sle][functional][u][migration][sle15sp1] Jobs incomplete due to missing assets added

#7 Updated by coolo over 1 year ago

  • Subject changed from [tools] gru cleans up assets too early to Document GRU cleanup strategy
  • Category set to 140
  • Target version changed from Ready to Current Sprint

GRU doesn't care about the age of the asset, only about the age of the jobs using it. So if only old jobs are attached to an asset, GRU will remove it.

So if your job group's size limit isn't large enough to hold the current assets, you need to increase it; there is no way around that. That's why we put so many details on the /admin/assets page.
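
To make the rule above concrete, here is a minimal Python sketch of how such a cleanup could pick assets for removal. It is not the actual GRU code: the `Asset` class, the `pick_assets_to_remove` function, the asset names, and the idea of using the highest job ID as a stand-in for "newest job" are all assumptions made for illustration. The only parts taken from this ticket are that assets are ranked by the jobs using them (not by their own age) and that the group has a size limit.

```python
from dataclasses import dataclass, field

GIB = 1024 ** 3


@dataclass
class Asset:
    """Hypothetical asset record; not the real openQA schema."""
    name: str
    size: int                                    # size in bytes
    job_ids: list = field(default_factory=list)  # IDs of jobs using the asset


def pick_assets_to_remove(assets, size_limit):
    """Return the names of assets that do not fit into the group's limit.

    Assets are ranked by the newest job using them (assumption: a higher
    job ID means a more recent job). The asset's own age is never looked at.
    """
    ranked = sorted(assets, key=lambda a: max(a.job_ids, default=0), reverse=True)
    used = 0
    to_remove = []
    for asset in ranked:
        if used + asset.size <= size_limit:
            used += asset.size             # still fits: keep it
        else:
            to_remove.append(asset.name)   # only older jobs use it and the limit is hit
    return to_remove


# Hypothetical group with an 80 GiB limit and three repos.
assets = [
    Asset("repo-20180820-debuginfo", 30 * GIB, job_ids=[100]),
    Asset("repo-20180823", 40 * GIB, job_ids=[300]),
    Asset("repo-20180822", 30 * GIB, job_ids=[200]),
]
print(pick_assets_to_remove(assets, 80 * GIB))  # ['repo-20180820-debuginfo']
```

In this sketch the debuginfo repo is dropped even though it is only a few days old, because every other asset is used by a newer job and the limit is already reached. Raising the limit to 100 GiB keeps all three.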

#8 Updated by mkittler over 1 year ago

  • Assignee set to mkittler

#9 Updated by mkittler over 1 year ago

  • Status changed from New to In Progress

#10 Updated by mkittler over 1 year ago

  • Status changed from In Progress to Resolved

PR is merged

#11 Updated by coolo over 1 year ago

  • Target version changed from Current Sprint to Done
