host=openqa.opensuse.org; worker=imagetester; failed_since=2019-09-30
for i in $(ssh $host "sudo -u geekotest psql --no-align --tuples-only --command=\"select id from jobs where assigned_worker_id in (select id from workers where host='$worker') and result='incomplete' and t_finished >= '$failed_since';\" openqa"); do
    openqa-client --host $host jobs/$i/restart post
done
Checking when this might have started. Looking with journalctl -u openqa-worker@*
I found the first "Result: died" coming from the worker with PID 2110, started around "Sep 30 03:34:24", that is https://openqa.opensuse.org/tests/1044019/file/autoinst-log.txt showing:
[2019-09-30T05:06:02.103 CEST] [debug] /var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/tests/x11/sshxterm.pm:43 called testapi::type_string
[2019-09-30T05:06:02.103 CEST] [debug] <<< testapi::type_string(string='killall xterm
', max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2019-09-30T05:06:02.449 CEST] [debug] <<< testapi::assert_screen(mustmatch='generic-desktop', timeout=30)
libpng error: Write Error
[2019-09-30T05:06:03.828 CEST] [debug] >>> testapi::_handle_found_needle: found generic-desktop-kde-plasma512-leap15.1-aarch64-20190409, similarity 1.00 @ 2/733
[2019-09-30T05:06:03.830 CEST] [debug] ||| finished sshxterm x11 at 2019-09-30 03:06:03 (45 s)
Can't close(GLOB(0x5617dca65888)) filehandle: 'No space left on device' at /usr/lib/os-autoinst/bmwqemu.pm line 322
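The journal search described above could be sketched roughly as follows. The helper name first_died and the --since date are assumptions for illustration, not the exact command that was run; the function only filters whatever log text it receives on stdin.

```shell
# Hypothetical helper: print the first line containing "Result: died"
# from the log text passed on stdin, then stop reading.
first_died() {
    grep -m1 'Result: died'
}

# Assumed usage on the worker host (the --since value is a guess):
#   journalctl -u 'openqa-worker@*' --since 2019-09-29 | first_died
```

Quoting the unit pattern keeps the shell from expanding the * before journalctl sees it.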
imagetester is configured to use tmpfs for the pool, but with only 64GB available and current tests often using 40GB we cannot sustain more than one instance. We would have more room on /dev/sda:
/dev/sda1 3.6T 54G 3.4T 2% /var/lib/openqa/cache
tmpfs 64G 32K 64G 1% /var/lib/openqa/pool
I am not aware of recent changes regarding this.
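As a rough back-of-the-envelope check of the capacity argument, here is a minimal sketch. The helper name max_instances is hypothetical, the 40GB per-job figure is the "often" value mentioned above, and 3.4T of free space on /dev/sda1 is taken as 3400GB for simplicity.

```shell
# Hypothetical helper: how many jobs of a given size fit into a pool.
# Both arguments are in GB; shell integer division rounds down.
max_instances() {
    pool_gb=$1
    job_gb=$2
    echo $(( pool_gb / job_gb ))
}

max_instances 64 40     # tmpfs pool from the df output above
max_instances 3400 40   # free space on /dev/sda1, assumed to be 3400GB
```

With these assumed numbers the tmpfs pool fits a single 40GB job, while the space on /dev/sda1 would leave room for many more.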