Project

General

Profile

action #70876

Updated by okurz over 3 years ago

### Observation observation 

 Not sure why this job incompleted: 

 ``` 
 [2020-09-02T00:15:10.754 CEST] [debug] Current version is 4.6.1598623827.924c995d [interface v20] 
 [2020-09-02T00:15:10.840 CEST] [debug] git hash in /var/lib/openqa/cache/openqa1-opensuse/tests/opensuse: ab7782333c508e136760d4299a0c32297a5c38fc 
 [2020-09-02T00:15:11.276 CEST] [debug] usingenv DESKTOP=kde 
 … 
 [2020-09-02T00:15:11.278 CEST] [debug] usingenv NOIMAGES=1 
 [2020-09-02T00:15:11.294 CEST] [debug] scheduling isosize tests/installation/isosize.pm 
 … 
 [2020-09-02T00:15:11.520 CEST] [debug] scheduling shutdown tests/shutdown/shutdown.pm 
 [2020-09-02T00:15:11.577 CEST] [info] cmdsrv: daemon reachable under http://*:20033/FDK5NTQyiHRYvqGw/ 
 [2020-09-02T00:15:11.579 CEST] [debug] git hash in /var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/products/opensuse/needles: 73ebc3acc3c3cd24f09d9f6c15edc470eba6e5eb 
 [2020-09-02T00:15:11.580 CEST] [debug] init needles from /var/lib/openqa/cache/openqa1-opensuse/tests/opensuse/products/opensuse/needles 
 [2020-09-02T00:15:11.583 CEST] [info] Listening at "http://[::]:20033" 
 Web application available at http://[::]:20033 
 [2020-09-02T00:15:12.201 CEST] [debug] inst-console-20200224 contains inst-console twice 
 [2020-09-02T00:15:13.370 CEST] [debug] loaded 8184 needles 
 [2020-09-02T00:15:13.634 CEST] [debug] Blocking SIGTERM 
 [2020-09-02T00:15:13.721 CEST] [debug] Unblocking SIGTERM 
 5575: channel_out 15, channel_in 14 
 [2020-09-02T00:15:13.990 CEST] [debug] Blocking SIGTERM 
 [2020-09-02T00:15:14.122 CEST] [debug] Unblocking SIGTERM 
 5966: cmdpipe 13, rsppipe 16 
 [2020-09-02T00:15:14.125 CEST] [debug] started mgmt loop with pid 5966 
 remove_tree qemuscreenshot 
 [2020-09-02T00:15:14.382 CEST] [debug] qemu version detected: 3.1.1.1 
 [2020-09-02T00:15:14.709 CEST] [debug] running /usr/bin/chattr -f +C /var/lib/openqa/pool/3/raid 
 [2020-09-02T00:15:15.238 CEST] [debug] running /usr/bin/qemu-img info --output=json /var/lib/openqa/pool/3/openSUSE-Jump-15.2-DVD-aarch64-Build72.2-Media.iso 
 [2020-09-02T00:15:15.443 CEST] [debug] Driver backend collected unknown process with pid 6596 and exit status: 0 
 [2020-09-02T00:15:15.477 CEST] [debug] backend process exited: 0 
 failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 128. 
 [2020-09-02T00:15:15.485 CEST] [debug] stopping command server 5612 because test execution ended through exception 
 [2020-09-02T00:15:16.690 CEST] [debug] done with command server 
 [2020-09-02T00:15:16.690 CEST] [debug] stopping autotest process 5645 
 [2020-09-02T00:15:17.893 CEST] [debug] done with autotest process 
 5575: EXIT 1 
 [2020-09-02T00:15:18.0121 CEST] [info] Isotovideo exit status: 1 
 [2020-09-02T00:15:18.0153 CEST] [info] +++ worker notes +++ 
 [2020-09-02T00:15:18.0154 CEST] [info] End time: 2020-09-01 22:15:18 
 [2020-09-02T00:15:18.0154 CEST] [info] Result: died 
 [2020-09-02T00:15:18.0238 CEST] [info] Uploading video.ogv 
 [2020-09-02T00:15:18.0319 CEST] [info] Uploading video_time.vtt 
 [2020-09-02T00:15:18.0355 CEST] [info] Uploading vars.json 
 [2020-09-02T00:15:18.0381 CEST] [info] Uploading autoinst-log.txt 
 ``` 

 The message `isotovideo died: failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 128.` is saved as reason but I'm not sure whether it is really the source of the problem or just a symptom. 

 --- 

 job reference: https://openqa.opensuse.org/tests/1380901 

 ## Steps to reproduce 

 Find reproducing jobs on production with 

 ``` 
 for i in o3 osd; do ssh $i "sudo -u geekotest psql openqa -c \"select id,t_finished,test,reason from jobs where t_finished >= '2020-09-01' and reason ~ 'isotovideo died: failed to start VM' order by t_finished desc limit 20;\""; done 
 ``` 

 ## Expected result 
 Problem does not happen anymore or if it happens there is more helpful output. 

 ## Suggestions 
 * Check current test code coverage in os-autoinst for the error handling of backend/driver.pm:128 with the line `$self->_send_json({cmd => 'start_vm'}) || die "failed to start VM";` 
 * Ensure the code is properly covered with tests 
 * Try to reproduce the above problem 
 * Improve feedback in these cases 

 ## Workaround 
 Retrigger

Back