action #40157 (closed)

Running out of space in openqa.suse.de for /var/lib/openqa

Added by szarate over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Urgent
Category: Feature requests
Start date: 2018-08-23
% Done: 0%

Description

For the past few days we've been trying to clean up some space, and it looks like much of the usage comes from our policy of keeping test results for quite some time; with every new SP or operating system version we tend to grow quite fast.

The following infra ticket was opened: https://infra.nue.suse.com/Ticket/Display.html?id=119536
Filesystem usage graph: https://nagios-devel.suse.de/pnp4nagios//index.php/graph?host=openqa.suse.de&srv=fs_%2Fvar%2Flib%2Fopenqa&theme=smoothness


Files

pnp4nagios.pdf (102 KB) pnp4nagios.pdf szarate, 2018-08-23 07:12
Actions #1

Updated by szarate over 6 years ago

Got an update from infra: if possible, we might get an answer by tomorrow.

Actions #2

Updated by szarate over 6 years ago

I also got the following, but it is unrelated to /var/lib/openqa, as these logs are written to /var/log/openqa_scheduler:

Aug 23 12:46:23 openqa openqa-scheduler[23341]: Uncaught exception from user code:
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Can't write to log: No space left on device at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 326.
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Mojo::Log::append('Mojo::Log=HASH(0x210f370)', '[2018-08-23T12:46:23.0687 CEST] [debug] [pid:23341] Could not...') called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Log.pm line 70
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Mojo::Log::_message('Mojo::Log=HASH(0x210f370)', 'debug', '[pid:23341] Could not get new jobs to allocate: {UNKNOWN}: Ca...') called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/EventE
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Mojo::EventEmitter::emit('Mojo::Log=HASH(0x210f370)', 'message', 'debug', '[pid:23341] Could not get new jobs to allocate: {UNKNOWN}: Ca...') called at /usr/lib/perl5/vendor_perl/5.
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Mojo::Log::_log('Mojo::Log=HASH(0x210f370)', 'debug', '[pid:23341] Could not get new jobs to allocate: {UNKNOWN}: Ca...') called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/Log.pm lin
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Mojo::Log::debug('Mojo::Log=HASH(0x210f370)', '[pid:23341] Could not get new jobs to allocate: {UNKNOWN}: Ca...') called at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 326
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Utils::_log('debug', '[pid:23341] Could not get new jobs to allocate: {UNKNOWN}: Ca...') called at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 312
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Utils::_log_msg('debug', 'Could not get new jobs to allocate: {UNKNOWN}: Can\'t write t...', 'channels', 'ARRAY(0x327aba8)', 'standard', 1) called at /usr/share/openqa/scrip
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Utils::_log_msg('debug', 'Could not get new jobs to allocate: {UNKNOWN}: Can\'t write t...') called at /usr/share/openqa/script/../lib/OpenQA/Utils.pm line 251
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Utils::log_debug('Could not get new jobs to allocate: {UNKNOWN}: Can\'t write t...') called at /usr/share/openqa/script/../lib/OpenQA/Scheduler/Scheduler.pm line 290
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Scheduler::Scheduler::catch {...} ('DBIx::Class::Exception=HASH(0x99e9d10)') called at /usr/lib/perl5/vendor_perl/5.18.2/Try/Tiny.pm line 115
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Try::Tiny::try('CODE(0x9538b28)', 'Try::Tiny::Catch=REF(0x9278618)') called at /usr/share/openqa/script/../lib/OpenQA/Scheduler/Scheduler.pm line 293
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Scheduler::Scheduler::schedule() called at /usr/lib/perl5/vendor_perl/5.18.2/x86_64-linux-thread-multi/Net/DBus/Callback.pm line 119
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Net::DBus::Callback::invoke('Net::DBus::Callback=HASH(0xc48a300)') called at /usr/lib/perl5/vendor_perl/5.18.2/x86_64-linux-thread-multi/Net/DBus/Reactor.pm line 385
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Net::DBus::Reactor::step('Net::DBus::Reactor=HASH(0x210ee78)') called at /usr/lib/perl5/vendor_perl/5.18.2/x86_64-linux-thread-multi/Net/DBus/Reactor.pm line 325
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         Net::DBus::Reactor::run('Net::DBus::Reactor=HASH(0x210ee78)') called at /usr/share/openqa/script/../lib/OpenQA/Scheduler.pm line 102
Aug 23 12:46:23 openqa openqa-scheduler[23341]:         OpenQA::Scheduler::run() called at /usr/share/openqa/script/openqa-scheduler line 38

Actions #3

Updated by okurz over 6 years ago

I don't know a better way than running du on /var/lib/openqa to find out where the space is used, whether it's assets, test results, or something else unforeseen.
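For reference, one way to narrow this down without a full recursive walk is to summarize one directory level at a time and sort by size. A minimal sketch; it runs against a throwaway directory here, but on the real host you would point `target` at /var/lib/openqa:

```shell
# Summarize disk usage one directory level at a time, largest first.
# Demonstrated on a temporary directory so the sketch runs anywhere.
target=$(mktemp -d)
mkdir -p "$target/testresults" "$target/share"
# create a ~2 MB file so one subdirectory clearly dominates
dd if=/dev/zero of="$target/testresults/blob" bs=1024 count=2048 2>/dev/null
# per-directory totals in KiB, biggest consumers on top
usage=$(du -k --max-depth=1 "$target" | sort -rn)
echo "$usage"
rm -rf "$target"
```

Repeating this on the largest subdirectory at each level usually finds the culprit far faster than one flat `du` over the whole tree.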

Actions #4

Updated by szarate over 6 years ago

I'm guessing it's a matter of calculating it properly, because a naive du can run for 12 hours and yield nothing :)

Actions #5

Updated by szarate over 6 years ago

I'm not sure how important this one is, but I have removed hdd/fixed/SLE-12-SP3-Server-Build0450-allpatterns-for-balance-old.qcow2 from there. I have a copy on my local machine (will make it available later).

openqa:/var/lib/openqa # du -chs share/factory/*
1.5T    share/factory/hdd
399G    share/factory/iso
1.8G    share/factory/other
766G    share/factory/repo
7.7M    share/factory/tmp
2.6T    total

Current capacity is 9.0T, of which 6.9T is used (about 76%). Subtracting the 2.6T under share/factory leaves roughly 4.3T in just test results... We need to think about our retention policy; I have already made Sebastian aware of it.
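The arithmetic above (the ~76% fill level and the ~4.3T not accounted for by share/factory) can be reproduced directly from the quoted figures:

```shell
# Sanity-check the numbers above: 6.9T used out of 9.0T capacity,
# of which 2.6T sits under share/factory; the remainder is mostly
# test results.
summary=$(awk 'BEGIN {
  total = 9.0; used = 6.9; factory = 2.6
  printf "used: %.1f%%\n", 100 * used / total
  printf "non-factory: %.1fT\n", used - factory
}')
echo "$summary"
```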

Actions #6

Updated by szarate over 6 years ago

  • Status changed from Blocked to In Progress

As a clarification: infra added 2T for us, but I got the message that we're getting close to the limits of their workloads (whatever that means).

Actions #7

Updated by szarate over 6 years ago

  • Status changed from In Progress to Resolved

xfs_growfs /var/lib/openqa did the trick after the infra ticket was resolved. We're now at 76% capacity.
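For the record, growing the filesystem is a short sequence once infra has enlarged the backing device. A hedged sketch; the rescan path and device name are assumptions about this host, and the verification step below runs against / as a harmless stand-in:

```shell
# After the backing device is enlarged, XFS can be grown online:
#   1. make the kernel rescan the device size, e.g.
#        echo 1 > /sys/class/block/sdX/device/rescan   # device name assumed
#   2. grow the mounted filesystem to all available space (xfs_growfs
#      defaults to the full size of the underlying device):
#        xfs_growfs /var/lib/openqa
# 3. verify the new fill level (shown here for / as a stand-in):
fill=$(df --output=pcent / | tail -n 1 | tr -d ' ')
echo "$fill"
```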

Actions #8

Updated by szarate over 6 years ago

  • Subject changed from Running out of space in openqa.suse.de to Running out of space in openqa.suse.de for /var/lib/openqa
Actions #9

Updated by coolo about 6 years ago

  • Target version changed from Current Sprint to Done