Actions
action #97979
closedopenQA Infrastructure - action #97976: [alert] OSD file systems - assets
Asset cleanup takes very long to process 60k files in "other" size:M
Description
Motivation¶
See #97976 . Attaching with strace to a gru process for asset cleanup shows that it takes very long to traverse files in the "other" directory (currently 60k files on OSD) which are all not big but seem to take a lot of time to traverse and also we process files seemingly in reverse alphabetic order
stat("/var/lib/openqa/share/factory/other/06577642-cve", {st_mode=S_IFREG|0644, st_size=1821, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577642-cve", {st_mode=S_IFREG|0644, st_size=1821, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577642-cve", {st_mode=S_IFREG|0644, st_size=1821, ...}) = 0
getpid() = 22883
stat("/var/lib/openqa/share/factory/other/06577641-uevent", {st_mode=S_IFREG|0644, st_size=54, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577641-uevent", {st_mode=S_IFREG|0644, st_size=54, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577641-uevent", {st_mode=S_IFREG|0644, st_size=54, ...}) = 0
getpid() = 22883
stat("/var/lib/openqa/share/factory/other/06577641-numa", {st_mode=S_IFREG|0644, st_size=538, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577641-numa", {st_mode=S_IFREG|0644, st_size=538, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577641-numa", {st_mode=S_IFREG|0644, st_size=538, ...}) = 0
getpid() = 22883
stat("/var/lib/openqa/share/factory/other/06577640-tracing", {st_mode=S_IFREG|0644, st_size=345, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577640-tracing", {st_mode=S_IFREG|0644, st_size=345, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577640-tracing", {st_mode=S_IFREG|0644, st_size=345, ...}) = 0
getpid() = 22883
stat("/var/lib/openqa/share/factory/other/06577635-pty", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577635-pty", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
stat("/var/lib/openqa/share/factory/other/06577635-pty", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
…
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0154-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0154-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0154-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
getpid() = 22883
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0151-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0151-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
stat("/var/lib/openqa/share/factory/repo/SLE-12-SP5-SDK-POOL-s390x-Build0151-Media1.license", {st_mode=S_IFDIR|0755, st_size=59, ...}) = 0
getpid() = 22883
which seems to have the following results:
- We spend very long time processing "other" where we hardly delete anything before we go to "iso" or "hdd" with bigger files
Suggestion¶
- Investigate if we really want to do Z-A sorting i.e. traverse more recent files by build number - or can we check mtime?
- Keep track of files already processed between runs (can we do that?)
- Reconsider ionicing the cleanup
Acceptance criteria¶
- AC1: Asset cleanup on OSD takes less time until it actually starts deleting files within one asset cleanup task
AC2: There are less files kept in "other"--> #99420
Actions