action #18608
open[qe-core][tools][sle][functional][research][medium] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring
100%
Description
Motivation
We recently saw many failures caused by a full disk on s390pb, which results from the lack of proper cleanup and monitoring.
Acceptance criteria
- AC1: Disks on jump hosts do not run full with assets
- AC2: The solution is not just a custom script in a custom cron job but is linked to openQA
- AC3: Limit is set to 50%
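One way to read AC3 is as an alert threshold on the image directory. The following is a minimal sketch of such a check, not an existing openQA feature. The function name `check_disk_usage`, the default path and the exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN, borrowed from the common monitoring-plugin convention) are assumptions made for illustration.

```bash
#!/bin/bash
# Sketch only: function name, default path and exit codes are assumptions,
# not something prescribed by this ticket or by openQA.

# check_disk_usage [DIR] [LIMIT]
# Prints a one-line status for the filesystem holding DIR and returns
# 0 if usage is at or below LIMIT percent, 2 if above, 3 if unreadable.
check_disk_usage() {
    local dir=${1:-/var/lib/libvirt/images} limit=${2:-50} used
    # df -P guarantees one line per filesystem; field 5 is "Capacity" (e.g. 42%).
    used=$(df -P "$dir" 2>/dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }') || used=""
    case $used in
        '' | *[!0-9]*)
            echo "UNKNOWN: cannot read usage of $dir"
            return 3
            ;;
    esac
    if [ "$used" -gt "$limit" ]; then
        echo "CRITICAL: $dir at ${used}% (limit ${limit}%)"
        return 2
    fi
    echo "OK: $dir at ${used}% (limit ${limit}%)"
    return 0
}
```

Calling `check_disk_usage /var/lib/libvirt/images 50` from whatever monitoring is in place would then flag a jump host before the disk actually runs full.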
Suggestions
- Ask mnowak whether this is also a problem for the hyperv host, or whether he has already solved it better since #18608#note-3
- Harmonize existing solutions, if any; at least collect them here
- Come up with a proper approach covered by openQA, e.g. also mention it in the openQA or os-autoinst documentation regarding jump hosts
Updated by mgriessmeier over 7 years ago
- Status changed from New to Resolved
Workaround: a cron job on s390pb that runs every 6 hours and deletes unused qcow images:
[root@s390pb images]# cat /usr/local/bin/cleanup-openqa-assets
#!/bin/sh -e
if [[ $(df | grep "/var/lib/libvirt/images" | awk '{print $5}' | sed "s/\%//") -gt 80 ]] ; then
find /var/lib/libvirt/images/*.qcow2 ! -exec sudo fuser -s "{}" 2>/dev/null \; -exec rm -f {} \;
fi
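For comparison, a slightly more defensive variant of the same idea could look like the sketch below. It is not the script deployed on s390pb. The function names `usage_percent` and `cleanup_images` are hypothetical, and the list of file extensions (qcow2, iso, img) is taken from the file types mentioned in this ticket. It uses a bash shebang because `[[ ]]` and `local` are not POSIX sh, asks `df -P` about the directory directly instead of grepping the full `df` output, and refuses to delete anything if `fuser` is unavailable.

```bash
#!/bin/bash
# Sketch only: function names and the extension list are assumptions.

# Print the usage of the filesystem holding $1 as a bare integer percentage.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# cleanup_images DIR [LIMIT]
# If the filesystem holding DIR is more than LIMIT percent full, delete
# image files in DIR that no process currently has open.
cleanup_images() {
    local dir=$1 limit=${2:-80} used f
    # Without fuser every file would look unused, so bail out instead.
    command -v fuser >/dev/null || { echo "fuser not found" >&2; return 1; }
    used=$(usage_percent "$dir") || return 1
    case $used in '' | *[!0-9]*) return 1 ;; esac
    [ "$used" -gt "$limit" ] || return 0
    for f in "$dir"/*.qcow2 "$dir"/*.iso "$dir"/*.img; do
        [ -f "$f" ] || continue                  # unmatched glob or not a regular file
        fuser -s "$f" 2>/dev/null && continue    # still opened by some process
        rm -f -- "$f"
    done
}
```

A cron entry or systemd timer would then call `cleanup_images /var/lib/libvirt/images 80`. This still has the limitation discussed in this ticket: it is a host-local job and not linked to openQA.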
Updated by michalnowak over 7 years ago
Thanks! I just added a similar script targeting qcow2, iso and img files on Xen & KVM openQA virt hosts.
Updated by mgriessmeier over 7 years ago
- Follows action #19080: [s390x][zkvm] test cases fails by no space left on device to download zkvm-image added
Updated by mgriessmeier over 7 years ago
- Status changed from Resolved to New
- Assignee deleted (mgriessmeier)
- Priority changed from Urgent to Normal
Reopening, because apparently a cron job is not the proper way of doing it. Lowering priority and unassigning, since I adjusted the cron job to run more often and don't plan to work on this in the near future. Feel free to take it.
Updated by okurz about 7 years ago
- Subject changed from [tools][sles][functional] Implement proper clean up for images on s390pb and a proper monitoring to [tools][sle][functional] Implement proper clean up for images on s390pb and a proper monitoring
- Due date set to 2018-01-30
- Target version set to Milestone 14
we might have an idea about it again when we discuss with others how to do it properly.
Updated by okurz almost 7 years ago
- Due date changed from 2018-01-30 to 2018-02-13
M14 only starts after 2018-01-30
Updated by okurz almost 7 years ago
- Subject changed from [tools][sle][functional] Implement proper clean up for images on s390pb and a proper monitoring to [tools][sle][functional] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring
- Description updated (diff)
- Due date deleted (2018-02-13)
- Status changed from New to Workable
- Target version deleted (Milestone 14)
@coolo is it "Ready"?
Updated by okurz almost 7 years ago
ok, great. http://lord.arch/tests/479 failed for me with
[2018-02-08T07:52:53.0208 CET] [debug] MATCH(rebootnow-390x-20160506:0.00)
[2018-02-08T07:52:53.0338 CET] [debug] MATCH(install_and_reboot-additional-packages-20170823:0.00)
[2018-02-08T07:52:53.0343 CET] [debug] no match: 3905.2s
[2018-02-08T07:52:53.0849 CET] [debug] no change: 3904.2s
[2018-02-08T07:52:54.0849 CET] [debug] no change: 3903.2s
[2018-02-08T07:52:55.0849 CET] [debug] no change: 3902.2s
[2018-02-08T07:52:56.0350 CET] [debug] considering VNC stalled, no update for 4.18 seconds
DIE Error connecting to host <10.161.145.7>: IO::Socket::INET: connect: Connection timed out
at /usr/lib/os-autoinst/backend/baseclass.pm line 80.
backend::baseclass::die_handler('OpenQA::Exception::VNCSetupError=HASH(0x6f2c320)') called at /usr/lib/perl5/vendor_perl/5.18.2/Exception/Class/Base.pm line 85
Exception::Class::Base::throw('OpenQA::Exception::VNCSetupError', 'error', 'Error connecting to host <10.161.145.7>: IO::Socket::INET: co...') called at /usr/lib/os-autoinst/consoles/VNC.pm line 151
consoles::VNC::login('consoles::VNC=HASH(0x6f2ac48)') called at /usr/lib/os-autoinst/consoles/VNC.pm line 842
consoles::VNC::send_update_request('consoles::VNC=HASH(0x6f2ac48)') called at /usr/lib/os-autoinst/consoles/vnc_base.pm line 82
consoles::vnc_base::request_screen_update('consoles::vnc_base=HASH(0x55da488)', undef) called at /usr/lib/os-autoinst/backend/baseclass.pm line 598
backend::baseclass::bouncer('backend::svirt=HASH(0x7a7c2d8)', 'request_screen_update', undef) called at /usr/lib/os-autoinst/backend/baseclass.pm line 581
backend::baseclass::request_screen_update('backend::svirt=HASH(0x7a7c2d8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 177
eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 156
backend::baseclass::run_capture_loop('backend::svirt=HASH(0x7a7c2d8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 129
backend::baseclass::run('backend::svirt=HASH(0x7a7c2d8)', 5, 8) called at /usr/lib/os-autoinst/backend/driver.pm line 85
backend::driver::start('backend::driver=HASH(0x6cbe360)') called at /usr/lib/os-autoinst/backend/driver.pm line 48
backend::driver::new('backend::driver', 'svirt') called at /usr/bin/isotovideo line 211
main::init_backend() called at /usr/bin/isotovideo line 280
[2018-02-08T07:55:03.0597 CET] [debug] Destroying openQA-SUT-12 virtual machine
Reason: Space depleted :(
Updated by okurz over 6 years ago
- Related to action #32932: [sle][functional][u][hyperv] test fails in logs_from_installation_system - Increase timeout for uploading logs added
Updated by okurz over 6 years ago
- Related to action #32926: [sle][functional][y][hyperv][medium] avoid typing username before switched tty (was: test fails in yast2_i - (mising needles?, rather too low timeout for hyperv) for Installation Report succesful) added
Updated by okurz over 6 years ago
- Related to action #32929: [sle][functional][u][hyperv] test fails in postgresql_server - SubState=running not found added
Updated by okurz over 6 years ago
- Related to action #31507: extend storage for /var/lib/libvirt/images on s390pb added
Updated by okurz over 6 years ago
- Subject changed from [tools][sle][functional] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring to [tools][sle][u][functional] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring
- Due date set to 2018-07-03
- Target version changed from Milestone 16 to Milestone 17
-> S20
Updated by okurz over 6 years ago
- Target version changed from Milestone 17 to Milestone 17
Updated by riafarov over 6 years ago
- Subject changed from [tools][sle][u][functional] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring to [tools][sle][u][functional][research][medium] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring
- Priority changed from Normal to Low
Next step: try to come up with a better solution, if possible, and then propose it. As of now we don't have a better solution in mind.
Lowering priority for this sprint.
Updated by mgriessmeier over 6 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-07-03 to 2018-07-31
low prio, due to hackweek - moving to sprint 22
Updated by okurz over 6 years ago
- Due date deleted (2018-07-31)
- Target version changed from Milestone 17 to future
Updated by okurz over 6 years ago
https://bugzilla.suse.com/show_bug.cgi?id=1103826 was created about all suse-kvm tests failing because of a full disk. mgriessmeier set it to RESOLVED INVALID, referring to this ticket, and rightly so.
Updated by okurz over 5 years ago
- Project changed from openQA Project (public) to openQA Infrastructure (public)
- Category deleted (168)
Updated by SLindoMansilla over 5 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by SLindoMansilla over 5 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by SLindoMansilla over 5 years ago
- Target version changed from future to Milestone 27
Updated by SLindoMansilla over 5 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by mgriessmeier about 5 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by mgriessmeier about 5 years ago
- Target version changed from Milestone 27 to Milestone 30+
Updated by mgriessmeier almost 5 years ago
- Start date set to 2017-04-29
due to changes in a related task
Updated by mgriessmeier almost 5 years ago
- Target version changed from Milestone 30+ to Milestone 35+
Updated by tjyrinki_suse about 4 years ago
- Subject changed from [tools][sle][u][functional][research][medium] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring to [qe-core][tools][sle][functional][research][medium] Implement proper clean up for images on jump hosts, e.g. s390pb, hyperv host, svirt and a proper monitoring
- Parent task deleted (#17574)
Updated by szarate over 2 years ago
- Target version changed from Milestone 35+ to future
Updated by okurz 10 months ago
- Related to action #154180: Proper kvm asset cleanup for s390x kvm backend (svirt) and tests added