action #124212

Updated by livdywan about 1 year ago

## Observation 

 The [openqa_install+publish jobs in the openQA-in-openQA tests](https://openqa.opensuse.org/tests/3105702#step/dashboard/9) started failing, which caused a wave of **Unreviewed issue (Group 24 openQA)** emails to be sent roughly every half hour. 

 The emails contain this: 

 ``` 
 # --- 8< --- 
 # [2023-02-09T09:03:50.690315+01:00] [debug] [pid:21411] QEMU status is not 'shutdown', it is 'running' 
 # [2023-02-09T09:03:50.690400+01:00] [debug] [pid:21268] backend shutdown state:  
 # [2023-02-09T09:03:50.690645+01:00] [info] [pid:21411] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json 
 # [2023-02-09T09:03:51.741976+01:00] [debug] [pid:21411] Passing remaining frames to the video encoder 
 # frame056 fps=0 q=0 Lsize!41kB time:02:07.29 bitrate7.8kbits/s speed=083x     
 # video:2122kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.936463% 
 # [2023-02-09T09:03:54.217738+01:00] [debug] [pid:21411] Waiting for video encoder to finalize the video 
 # [2023-02-09T09:03:54.217801+01:00] [debug] [pid:21411] The external video encoder (pid 21537) terminated 
 # [2023-02-09T09:03:54.217841+01:00] [debug] [pid:21411] The built-in video encoder (pid 21538) terminated 
 # [2023-02-09T09:03:54.218255+01:00] [debug] [pid:21411] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 21411 (/usr/bin/isotovideo: backen) 
 # --- >8 --- 
 ``` 

 The `SIGTERM` is expected here. More relevant messages can also be found in the log: 

 ``` 
 [2023-02-09T10:36:43.583422+01:00] [debug] [pid:12662] no match: -0.9s, best candidate: openqa-dashboard-no_jobs-tumbleweed-20200106 (0.00) 
 [2023-02-09T10:36:43.892722+01:00] [debug] [pid:12502] >>> testapi::_check_backend_response: match=openqa-dashboard timed out after 360 (assert_screen) 
 [2023-02-09T10:36:43.986352+01:00] [info] [pid:12502] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-dashboard' matched 
 ``` 
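
 As a rough illustration of how the more relevant lines could be picked out of such a log, here is a minimal Python sketch. The function name `find_relevant_lines` and the pattern list are assumptions made for this example, derived only from the excerpt above; they are not taken from any existing openQA script.

```python
import re

# Hypothetical patterns, chosen from the log excerpt in this ticket.
# They are not the patterns used by any existing openQA tooling.
RELEVANT_PATTERNS = [
    re.compile(r"Test died:"),
    re.compile(r"no candidate needle with tag\(s\)"),
    re.compile(r"timed out after \d+ \(assert_screen\)"),
]


def find_relevant_lines(log_text):
    """Return the log lines matching any pattern, in their original order."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in RELEVANT_PATTERNS)
    ]


# The three log lines quoted above, used as sample input.
SAMPLE_LOG = """\
[2023-02-09T10:36:43.583422+01:00] [debug] [pid:12662] no match: -0.9s, best candidate: openqa-dashboard-no_jobs-tumbleweed-20200106 (0.00)
[2023-02-09T10:36:43.892722+01:00] [debug] [pid:12502] >>> testapi::_check_backend_response: match=openqa-dashboard timed out after 360 (assert_screen)
[2023-02-09T10:36:43.986352+01:00] [info] [pid:12502] ::: basetest::runtest: # Test died: no candidate needle with tag(s) 'openqa-dashboard' matched
"""

if __name__ == "__main__":
    for line in find_relevant_lines(SAMPLE_LOG):
        print(line)
```

 With the sample input, the `no match` debug line is skipped and the timeout and `Test died` lines are kept.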

 ## Acceptance criteria 
 - **AC1**: There is always a helpful hint explaining why the email was sent 
 - **AC2**: Jobs are not built so frequently that they cause email floods 

 ## Out of scope 
 - A failure in the same job doesn't cause repeated emails 

 ## Suggestions 
 - Verify that a comment like `poo#124143` prevents an "unreviewed issue" from being detected 
 - Investigate which error message, if any, triggered the script 
 - Add a note on why the job is "unreviewed", i.e. unreviewed always means no bug ref 
 - Always include `no candidate needle with tag(s)` messages in the email 
 - Consider explicitly treating needle mismatches as "reviewed" 
 - There's ``[2023-02-09T10:37:10.650093+01:00] [warn] [pid:12502] !!! testapi::script_run: DEPRECATED call of script_run() in lib/openQAcoretest.pm:8 add `die_on_timeout => ?` to the call or set $distri->{script_run_die_on_timeout} to avoid this warning``, which could be seen as an unreviewed error message, but it does not appear in the email snippet 
 - Check if this could be an unintended side-effect of #98862 
 - Check the openQA-in-openQA trigger-test-monitor pipeline on jenkins.qa.suse.de; maybe we trigger too many jobs 
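
 To make the "unreviewed always means no bug ref" idea concrete, here is a minimal Python sketch of such a check. The helper names `has_bug_reference` and `is_unreviewed` are hypothetical, and the prefix list is an illustrative subset assumed for this example; the set of references the real tooling accepts may differ.

```python
import re

# Illustrative subset of tracker prefixes (assumption for this sketch).
# "poo" comes from the example comment in this ticket.
BUGREF = re.compile(r"\b(?:poo|bsc|boo)#\d+\b")


def has_bug_reference(comment):
    """Return True if the comment text contains something like poo#124143."""
    return BUGREF.search(comment) is not None


def is_unreviewed(comments):
    """Treat a job as unreviewed when none of its comments has a bug ref."""
    return not any(has_bug_reference(comment) for comment in comments)


if __name__ == "__main__":
    print(is_unreviewed(["looks flaky, retriggered"]))
    print(is_unreviewed(["needle mismatch, see poo#124143"]))
```

 Under these assumptions, a job whose only comment is free text stays unreviewed, while one comment containing `poo#124143` is enough to mark it as reviewed.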