File systems alert 90.256 assets used size:M
[Alerting] File systems alert
One of the file systems is too full
Metric name: /assets: Used Percentage
Value: 90.256
- Find assets to delete
Use archive feature to move assets?
- See if the cleanup ran properly
- Status changed from Workable to Resolved
- Assignee set to mkittler
I've actually been handling this, see my mails. The alert is now ok again. Nothing was really broken; the asset cleanup was just postponed for too long (but in a way which is expected).
I've been asking myself the following questions on how to improve this in the future:
- Maybe we could also change the locking to allow running the cleanup of assets and results concurrently? In our setup results and assets are on different disks, so running both at the same time shouldn't be counterproductive, and in this case it would have helped. In fact, I resolved the issue by manually deleting the limit_tasks lock to let the asset cleanup run in parallel with the result cleanup.
- The last 3 asset cleanup jobs which could actually have run did not, because at that point the threshold hadn't been reached and the cleanup was therefore skipped. The same applies to the result cleanup which ran before the currently active one. It was skipped because we were under the threshold, but that likely contributes to the fact that today's cleanup is taking very long. Maybe we should rethink postponing the cleanup according to the thresholds (at least in its current form)?