Prevent depletion of space on /tmp due to mojo.tmp files from os-autoinst
Target version: Current Sprint
https://gitlab.suse.de/openqa/osd-deployment/-/jobs/138860 showed that / is overly full on arm-1. The cause turned out to be large mojo.tmp files in /tmp:
# ls -ltrahS /tmp
total 24G
…
-rw------- 1 _openqa-worker nogroup 969M Sep 28 14:56 mojo.tmp.nq0vLzabb1zWmjGO
-rw------- 1 _openqa-worker nogroup 1.1G Sep 28 15:35 mojo.tmp.F3I7oq122kFrKSfR
-rw------- 1 _openqa-worker nogroup 1.6G Sep 19 19:20 mojo.tmp.JaJQjPsRKeKN_uNA
-rw------- 1 _openqa-worker nogroup 1.8G Sep 28 14:35 mojo.tmp._AGyK7gXhVtqG_nr
-rw------- 1 _openqa-worker nogroup 3.5G Sep 19 18:48 mojo.tmp.bsfcU_u1vvxM0DBk
-rw------- 1 _openqa-worker nogroup 3.5G Nov 8 10:26 mojo.tmp.An73wQN6Zn7AgEX6
-rw------- 1 _openqa-worker nogroup 5.5G Sep 28 14:29 mojo.tmp.7oNEbwY_XExxQnfQ
-rw------- 1 _openqa-worker nogroup 5.6G Sep 28 15:33 mojo.tmp.FOnWm_q01JT9NAEM
mojo.tmp files are mostly uploaded files that Mojolicious stores on disk during the upload process. The worker sets $cachedir/tmp for those and the webui uses assets/tmp, so the only place where we don't set MOJO_TMPDIR, afaik, is os-autoinst. About /tmp in general, see https://en.opensuse.org/openSUSE:Tmp_on_tmpfs . So regarding /tmp: should we look into setting MOJO_TMPDIR in os-autoinst, set up automatic cleanup of /tmp, or ignore it?
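To judge the "automatic cleanup" option, it helps to first see which mojo.tmp files are actually stale. The following is only a minimal sketch: the function name `list_stale_mojo_tmp` and the default age of 2 days are assumptions for illustration, not something taken from os-autoinst or the salt states. It only lists files and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are assumptions): list regular
# files named mojo.tmp.* directly inside a directory whose last
# modification is more than the given number of whole days ago.
# Note: find rounds file age down to whole days before comparing.
list_stale_mojo_tmp() {
    dir=${1:-/tmp}
    days=${2:-2}
    find "$dir" -maxdepth 1 -type f -name 'mojo.tmp.*' -mtime "+$days" -print
}
```

Calling `list_stale_mojo_tmp /tmp 2` prints the candidates for review. Files that are still being written by a running upload have a recent modification time and are therefore not listed.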
[13/11/2019 10:48:59] <coolo> coolo@f102#~>cat /etc/tmpfiles.d/tmp.conf
[13/11/2019 10:48:59] <coolo> d /tmp 1777 root root 10d
[13/11/2019 10:49:06] <coolo> for workers we should go with less days even
[13/11/2019 10:49:28] <coolo> but os-autoinst using /tmp is problematic in itself - we got pools on ssds and / on slow, small disks
[13/11/2019 10:50:01] <coolo> so setting MOJO_TMPDIR to pool directory is due anyway - I wonder why tests would upload GBs to os-autoinst though
[13/11/2019 10:50:08] <coolo> I mean these files weren't exactly small
[13/11/2019 10:52:09] <okurz> you know our testers, maybe someone downloading from within tests? like these xen images? but not these, as we are on arm
[13/11/2019 10:53:04] <sebastianriedel> Correct, it should be exclusively temporary uploads that were too large to put into memory and that were not cleaned up properly for "reasons"
[13/11/2019 10:53:49] <sebastianriedel> (or that are still being processed of course)
[13/11/2019 10:58:08] <coolo> with arm workers doing sudden deaths, I wouldn't worry about the cleanup part
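The idea of pointing MOJO_TMPDIR at the pool directory could be sketched roughly as follows. This is an illustration only: the wrapper name `run_with_pool_tmpdir` and the `tmp` subdirectory are assumptions, and the actual change in os-autoinst may well set the variable elsewhere. MOJO_TMPDIR itself is the environment variable Mojolicious consults when choosing where to create its temporary files.

```shell
#!/bin/sh
# Hypothetical wrapper (name and "tmp" subdirectory are assumptions):
# run a command with MOJO_TMPDIR pointing into the given pool directory,
# so that large temporary uploads land on the pool disk instead of /tmp.
run_with_pool_tmpdir() {
    pool_dir=$1
    shift
    # Create the temporary directory first; give up if that fails.
    mkdir -p "$pool_dir/tmp" || return 1
    # env sets the variable only for the command being started.
    env MOJO_TMPDIR="$pool_dir/tmp" "$@"
}
```

Usage would look like `run_with_pool_tmpdir /var/lib/openqa/pool/1 some-command`, where the pool path is an assumed example. Because the variable is set per command, each worker instance can use its own pool directory.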
- Status changed from New to Feedback
- Assignee set to okurz
- Target version set to Current Sprint
Looking at https://gitlab.suse.de/openqa/osd-deployment/-/jobs/176453 seems to reveal sda3 (btrfs) cf. the mapper/system-root (btrfs). Although it doesn't show what occupies that space... I gather the ls used above is not used anywhere in the CI script.
You are referring to https://gitlab.suse.de/openqa/osd-deployment/-/jobs/176453#L28 which is an alert about openqaworker-arm-2, a worker machine, not a webui host. For this ticket I was waiting for our Scrum Master to remind kraih to update his ticket ;) Once we have confirmed that the feature works, we should remove the custom tmpfile cleanup in salt.