action #94318
http://jenkins.qa.suse.de not reachable
Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
Start date:
2021-04-26
Due date:
% Done:
0%
Estimated time:
Added by okurz over 3 years ago. Updated over 3 years ago.
Description
Did sudo su and found with df -h that / is full. Did snapper ls:
   # | Type   | Pre # | Date                     | User | Used Space | Cleanup | Description           | Userdata
-----+--------+-------+--------------------------+------+------------+---------+-----------------------+--------------
  0  | single |       |                          | root |            |         | current               |
  1* | single |       | Thu Oct 19 19:40:57 2017 | root | 105.70 MiB |         | first root filesystem |
184  | pre    |       | Mon May 17 12:45:23 2021 | root |  93.90 MiB | number  | zypp(zypper)          | important=yes
185  | post   |   184 | Mon May 17 12:46:46 2021 | root |   8.89 MiB | number  |                       | important=yes
208  | pre    |       | Sun Jun  6 12:45:22 2021 | root |   4.31 MiB | number  | zypp(zypper)          | important=yes
209  | post   |   208 | Sun Jun  6 12:46:14 2021 | root |  15.76 MiB | number  |                       | important=yes
216  | pre    |       | Sun Jun 13 03:00:07 2021 | root |   2.08 MiB | number  | zypp(zypper)          | important=no
217  | post   |   216 | Sun Jun 13 03:00:25 2021 | root |   2.05 MiB | number  |                       | important=no
218  | pre    |       | Sun Jun 20 03:00:24 2021 | root |  88.91 MiB | number  | zypp(zypper)          | important=yes
Did snapper rm 184-217 to get some space. After that did mount -o rw,remount / and triggered a zypper dup to ensure a consistent upgraded state.
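Before deleting a snapshot range it can help to estimate how much space the range holds. The following is a minimal sketch only: the helper name snapshot_range_usage is hypothetical, it only counts values given in MiB, and it assumes the table layout shown in the snapper ls output above. The "Used Space" column reports space exclusive to each snapshot, so the amount actually freed can be larger than this sum.

```shell
#!/bin/sh
# Hypothetical helper: sum the "Used Space" column (MiB values only) for all
# snapshots whose number lies in [first, last].
# Reads `snapper ls`-style table text on stdin.
snapshot_range_usage() {
    awk -F'|' -v first="$1" -v last="$2" '
        {
            num = $1
            gsub(/[^0-9]/, "", num)   # drop spaces and markers such as "*"
            # header and separator lines yield an empty number and are skipped
            if (num == "" || num + 0 < first || num + 0 > last) next
            # the sixth column looks like " 93.90 MiB "; other units are ignored
            if (split($6, parts, " ") == 2 && parts[2] == "MiB") total += parts[1]
        }
        END { printf "%.2f MiB\n", total }
    '
}

# Possible usage on the affected host (not executed here):
#   snapper ls | snapshot_range_usage 184 217
```

For the table above this adds up the six snapshots 184 to 217 that were removed.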
I also encountered a process running wild:
6958 jenkins B4 20 0 3510M 1646M 640 D 845 N/A 23.9 83.2 1:27.14 /usr/bin/perl /usr/share/openqa/script/client --host https://openqa.opensuse.org jobs/
eating up all memory.
The accumulating space usage is probably mainly due to big updates, especially kernel updates, where old versions are never deleted because we do not reboot automatically. So we need automatically triggered reboots.
Using https://github.com/okurz/scripts/blob/master/opensuse-install-auto-update I extended /etc/cron.d/auto-update with
&& needs-restarting --reboothint >/dev/null || (command -v rebootmgrctl >/dev/null && rebootmgrctl reboot ||:)' > /etc/cron.d/auto-update
to trigger a reboot when updates demand it. Also did systemctl enable --now rebootmgr and triggered an explicit reboot.
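The reboot logic in the cron fragment above can be read as the following sketch. It assumes, as the fragment implies, that needs-restarting --reboothint exits non-zero when a reboot is suggested. The function name reboot_if_needed and the two override variables NEEDS_RESTARTING_CMD and REBOOT_CMD are hypothetical and exist only so the logic can be tried without rebooting anything.

```shell
#!/bin/sh
# Hypothetical wrapper around the check used in /etc/cron.d/auto-update.
# NEEDS_RESTARTING_CMD and REBOOT_CMD are illustrative overrides, not part of
# the original script.
reboot_if_needed() {
    check=${NEEDS_RESTARTING_CMD:-needs-restarting --reboothint}
    reboot=${REBOOT_CMD:-rebootmgrctl reboot}

    # Assumption: exit status 0 means no reboot is suggested
    if $check >/dev/null 2>&1; then
        echo "no reboot needed"
        return 0
    fi

    # Only request the reboot when the reboot command exists; never fail,
    # mirroring the "||:" in the cron entry
    command -v "${reboot%% *}" >/dev/null 2>&1 && $reboot || :
}
```

With rebootmgr enabled, the reboot request is handed to rebootmgrctl rather than rebooting directly from the cron job.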
After the reboot the older kernel versions have been removed, so that only two versions (current and previous) remain. But / is still 99% used. I think it's problematic that the filesystem layout is still old and /var/lib/jenkins is not on a subvolume, so its content will be included in every snapshot.
So I created a new btrfs subvolume and moved the existing content there:
systemctl stop jenkins
mount /dev/sda2 -o subvol=@ /mnt
btrfs subvol create /mnt/var/lib/jenkins
echo 'UUID=0b8a4aea-1f96-4b60-b305-57a7317d0067 /var/lib/jenkins btrfs subvol=@/var/lib/jenkins 0 0' >> /etc/fstab
mv /var/lib/jenkins/* /mnt/var/lib/jenkins/
umount /mnt
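One caveat about the mv step above: a plain * glob does not match hidden entries, so any dotfiles directly under /var/lib/jenkins would stay behind. Whether such files existed on this host is not stated. A minimal sketch of a move that also covers hidden entries, using a hypothetical helper move_all_content and assuming GNU or BSD find:

```shell
#!/bin/sh
# Hypothetical helper: move every direct child of src into dst, including
# hidden entries that a "*" glob would skip.
move_all_content() {
    src=$1
    dst=$2
    # -mindepth 1 excludes src itself, -maxdepth 1 keeps to direct children
    find "$src" -mindepth 1 -maxdepth 1 -exec mv -- {} "$dst"/ \;
}

# Possible usage for the step above (not executed here):
#   move_all_content /var/lib/jenkins /mnt/var/lib/jenkins
```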
and now trying to reclaim more space, e.g. with systemctl start btrfs-balance btrfs-scrub btrfs-trim
That reclaimed a bit. I also manually deleted some older logfiles in /var/log. Now we have
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        38G   29G  9.3G  76% /
That should suffice for some time.
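To notice earlier when / fills up again, the usage figure could be checked against a threshold. This is only a sketch: the helper names df_use_percent and usage_at_least are hypothetical, and it assumes single-line-per-filesystem output such as df -P produces.

```shell
#!/bin/sh
# Hypothetical helper: print the Use% value (without the % sign) taken from
# the first data line of df output read on stdin.
df_use_percent() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Hypothetical helper: succeed when the usage read from stdin is at or above
# the given threshold in percent.
usage_at_least() {
    used=$(df_use_percent)
    [ "$used" -ge "$1" ]
}

# Possible usage (not executed here):
#   df -P / | usage_at_least 90 && echo "root filesystem is nearly full"
```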