Project

General

Profile

action #106912

Updated by livdywan almost 3 years ago

### Observation 

 This is is a shortened version of the relevant log: 

 ``` 
     # [2022-02-15T12:30:06.683729Z] [debug] ||| starting boot tests/boot.pm 
     # [2022-02-15T12:30:06.685985Z] [debug] tests/boot.pm:9 called testapi::assert_screen 
     # [2022-02-15T12:30:06.686361Z] [debug] <<< testapi::assert_screen(mustmatch="core", timeout=15, no_wait=1) 
     # [2022-02-15T12:30:06.700419Z] [debug] no match: 14.9s, best candidate: boot-core (0.00) 
 … 
     # [2022-02-15T12:30:07.125518Z] [debug] no change: 14.5s 
     # [2022-02-15T12:30:07.134823Z] [debug] no match: 14.5s, best candidate: boot-core (0.00) 
     # [2022-02-15T12:30:07.331434Z] [debug] >>> testapi::_handle_found_needle: found boot-core, similarity 0.97 @ 1/74 
     # [2022-02-15T12:30:07.332261Z] [debug] tests/boot.pm:10 called testapi::send_key 
     # [2022-02-15T12:30:07.332397Z] [debug] <<< testapi::send_key(key="ret", wait_screen_change=0, do_wait=0) 
     # [2022-02-15T12:30:07.602261Z] [debug] tests/boot.pm:14 called testapi::assert_screen 
     # [2022-02-15T12:30:07.602421Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", timeout=90) 
     # [2022-02-15T12:30:08.241080Z] [debug] no match: 89.3s, best candidate: boot-on_prompt (0.00) 
 … 
     # [2022-02-15T12:30:21.249634Z] [debug] no match: 76.3s, best candidate: boot-on_prompt (0.00) 
     # [2022-02-15T12:30:22.370453Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 
     # [2022-02-15T12:30:22.374020Z] [debug] ||| finished boot tests (runtime: 16 s) 
     # [2022-02-15T12:30:22.375138Z] [debug] ||| starting assert_screen tests/assert_screen.pm 
     # [2022-02-15T12:30:22.375777Z] [debug] tests/assert_screen.pm:9 called testapi::assert_screen 
     # [2022-02-15T12:30:22.375889Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", timeout=60) 
     # [2022-02-15T12:30:23.323411Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 
     # [2022-02-15T12:30:23.323585Z] [debug] tests/assert_screen.pm:10 called testapi::assert_screen 
     # [2022-02-15T12:30:23.323700Z] [debug] <<< testapi::assert_screen(mustmatch="on_prompt", timeout=30, no_wait=1) 
     # [2022-02-15T12:30:23.397251Z] [debug] >>> testapi::_handle_found_needle: found boot-on_prompt, similarity 1.00 @ 2/64 
     # [2022-02-15T12:30:23.398459Z] [debug] ||| finished assert_screen tests (runtime: 1 s) 
     # [2022-02-15T12:30:23.399556Z] [debug] ||| starting shutdown tests/shutdown.pm 
     # [2022-02-15T12:30:23.400299Z] [debug] tests/shutdown.pm:8 called testapi::type_string 
     # [2022-02-15T12:30:23.400468Z] [debug] <<< testapi::type_string(string="sudo su\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47) 
     # [2022-02-15T12:30:23.671645Z] [debug] tests/shutdown.pm:9 called testapi::type_string 
     # [2022-02-15T12:30:23.671846Z] [debug] <<< testapi::type_string(string="poweroff\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47) 
     # [2022-02-15T12:30:23.976613Z] [debug] tests/shutdown.pm:10 called testapi::assert_shutdown 
     # [2022-02-15T12:30:23.976768Z] [debug] <<< testapi::check_shutdown(timeout=90) 
     # [2022-02-15T12:30:23.977601Z] [debug] QEMU status is not 'shutdown', it is 'running' 
     # [2022-02-15T12:30:24.978574Z] [debug] EVENT {"data":{"path":"\/machine\/peripheral-anon\/device[2]\/virtio-backend"},"event":"NIC_RX_FILTER_CHANGED","timestamp":{"microseconds":798568,"seconds":1644928224}} 
     # [2022-02-15T12:30:24.978769Z] [debug] QEMU status is not 'shutdown', it is 'running' 
 … 
     # [2022-02-15T12:31:54.110983Z] [debug] QEMU status is not 'shutdown', it is 'running' 
     # [2022-02-15T12:31:55.174986Z] [info] ::: basetest::runtest: # Test died: Machine didn't shut down! at tinycore/tests/shutdown.pm line 10. 
     #    
     # [2022-02-15T12:31:55.177594Z] [debug] ||| finished shutdown tests (runtime: 92 s) 
     # [2022-02-15T12:31:55.177662Z] [debug] ||| post fail hooks runtime: 0 s 
 ``` 

 The test run was conducted as "job 7" as part of openQA's fullstack test (with caching enabled). I suppose it should also be reproducible in other "fullstack scenarios" where we expect the test to pass. Maybe os-autoinst's own fullstack test is not affected because it runs a few more test modules before `shutdown`. The change https://github.com/os-autoinst/os-autoinst/commit/a7103b16 was supposed to fix that and it definitely improved the situation. However, it seems in very rare cases¹ the test can still fail. (The fix was definitely used here as one can see in the logs that the `assert_screen` module is successfully executed.) 

 ¹ I ran openQA's fullstack test about 75 times in CircleCI until I hit an occurrence. 

 ## Acceptance criteria 
 - **AC1**: openQA fullstack test runs 100 times without failing 

 ## Suggestions 
 - Not Ondrej's fault 
 - Make it fail on purpose... but we don't know the exact cause, only the symptom 
 - Reveal the video from the pool directory after reproducing locally 

 ## Out of scope 

 - Reveal the video from the pool directory in CI

Back