openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | 2023-03-01T20:06:09Z | generator: Redmine
openQA Project - action #125237 (Resolved): os-autoinst codecov check "fully_covered" returns 99%... | https://progress.opensuse.org/issues/125237 | 2023-03-01T20:06:09Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The problem first mentioned in <a href="https://github.com/os-autoinst/os-autoinst/pull/2260#issuecomment-1448414226" class="external">https://github.com/os-autoinst/os-autoinst/pull/2260#issuecomment-1448414226</a> now shows up in other pull requests as well. Why does codecov report 99.28% coverage for the "fully_covered" section configured in <a href="https://github.com/os-autoinst/os-autoinst/blob/master/codecov.yml#L18" class="external">https://github.com/os-autoinst/os-autoinst/blob/master/codecov.yml#L18</a> even though <a href="https://app.codecov.io/gh/os-autoinst/os-autoinst/pull/2270/tree" class="external">https://app.codecov.io/gh/os-autoinst/os-autoinst/pull/2270/tree</a> shows all referenced paths as 100% covered?</p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<p>Seems to happen in all pull requests that are newly opened or updated.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> It is known why the actual percentage doesn't match the expected 100%</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Bisect like it is done in <a href="https://github.com/os-autoinst/os-autoinst/pull/2271" class="external">https://github.com/os-autoinst/os-autoinst/pull/2271</a> and similar PRs</li>
<li>Click on "View details" in merged PRs to see the checks</li>
</ul>
openQA Project - action #123888 (Resolved): [os-autoinst] Clone retry attempts seem to be the wro... | https://progress.opensuse.org/issues/123888 | 2023-02-02T10:21:31Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://openqa.opensuse.org/tests/3089742/logfile?filename=autoinst-log.txt" class="external">https://openqa.opensuse.org/tests/3089742/logfile?filename=autoinst-log.txt</a></p>
<pre><code>[2023-02-02T11:16:59.522322+01:00] [info] [pid:7469] ::: OpenQA::Isotovideo::Utils::clone_git: Cloning git URL 'https://github.com/os-autoinst/os-autoinst-distri-openQA.git'
[2023-02-02T11:16:59.522363+01:00] [info] [pid:7469] ::: OpenQA::Isotovideo::Utils::clone_git: Checking out git refspec/branch 'use_podman_everywhere'
[2023-02-02T11:16:59.997119+01:00] [debug] [pid:7469] Cloning into 'os-autoinst-distri-openQA'...
[2023-02-02T11:16:59.997200+01:00] [debug] [pid:7469] Clone failed, retries left: 1 of 2
[2023-02-02T11:17:04.997519+01:00] [debug] [pid:7469] Skipping to clone 'https://github.com/os-autoinst/os-autoinst-distri-openQA.git'; os-autoinst-distri-openQA already exists
[2023-02-02T11:17:05.008632+01:00] [debug] [pid:7469] git hash in /var/lib/openqa/pool/15/os-autoinst-distri-openQA: 552d068b96cf247e2191c4ee6af16dc8f016a4f8
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1</strong>: Clones are retried up to 2 times in case of failures</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Review the retry logic in os-autoinst</li>
<li>Extend unit test coverage</li>
</ul>
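<p>The log above suggests that the failed first attempt left a directory behind, so the retry was skipped with "already exists". A minimal sketch of a retry loop that cleans up between attempts could look as follows. It is written in Python for illustration only; <code>clone_with_retries</code> and the <code>clone</code> callable are hypothetical names and not part of os-autoinst, and "2 retries" is assumed to mean one initial attempt plus up to two more.</p>

```python
import shutil
from pathlib import Path


def clone_with_retries(clone, target, retries=2):
    """Run clone(target); on failure clean up and retry up to `retries` times.

    `clone` is a hypothetical callable that raises an exception on failure.
    Returns the number of attempts that were needed.
    """
    target = Path(target)
    for attempt in range(1, retries + 2):  # first attempt plus `retries` retries
        try:
            clone(target)
            return attempt
        except Exception:
            # A failed clone can leave a partial directory behind. Remove it,
            # otherwise an "already exists" check would skip the next attempt.
            shutil.rmtree(target, ignore_errors=True)
            if attempt == retries + 1:
                raise
```

<p>A unit test can pass a fake <code>clone</code> that creates the directory and fails once, then check that a second real attempt happens instead of a skip.</p>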
openQA Project - action #122929 (Resolved): [os-autoinst] Unhandled test output in t/18-backend-q... | https://progress.opensuse.org/issues/122929 | 2023-01-10T16:02:12Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>We want clean TAP output, with all output from the tested application code captured. In <a href="https://github.com/os-autoinst/os-autoinst/actions/runs/3876337077/jobs/6610044151#step:3:618">https://github.com/os-autoinst/os-autoinst/actions/runs/3876337077/jobs/6610044151#step:3:618</a> we see unhandled output:</p>
<pre><code>7: [2023-01-09T17:53:01.568078Z] [debug] [pid:2148] >>> basetest::verify_sound_image: found foundneedle, similarity 100.00 @ 1/2
7: [2023-01-09T17:53:01.573017Z] [debug] [pid:2148] >>> basetest::verify_sound_image: failed to find /opt/t/data/frame2.ppm
7: [2023-01-09T17:53:01.577323Z] [debug] [pid:2148] >>> basetest::verify_sound_image: failed to find /opt/t/data/frame2.ppm
7: [2023-01-09T17:53:01.582419Z] [info] [pid:2148] ::: basetest::runtest: # Test died: test failure at ./t/17-basetest.t line 519.
7:
7: [2023-01-09T17:53:01.583510Z] [debug] [pid:2148] ignoring previously logged failure via developer mode
7: [2023-01-09T17:53:01.584096Z] [debug] [pid:2148] ||| finished basetest unknown (runtime: 0 s)
7: [17:53:02] ./t/17-basetest.t .......................... ok 1849 ms ( 0.01 usr 0.00 sys + 1.80 cusr 0.14 csys = 1.95 CPU)
7: [2023-01-09T17:53:03.661582Z] [warn] [pid:2151] !!! backend::qemu::_set_graphics_backend: QEMU_OVERRIDE_VIDEO_DEVICE_AARCH64 is deprecated, please set QEMU_VIDEO_DEVICE=VGA instead
7: [2023-01-09T17:53:03.662028Z] [warn] [pid:2151] !!! backend::qemu::_set_graphics_backend: QEMUVGA is deprecated, please set QEMU_VIDEO_DEVICE
7: [2023-01-09T17:53:03.662428Z] [warn] [pid:2151] !!! backend::qemu::_set_graphics_backend: QEMUVGA is deprecated, please set QEMU_VIDEO_DEVICE
7: [2023-01-09T17:53:03.662817Z] [warn] [pid:2151] !!! backend::qemu::_set_graphics_backend: Both QEMUVGA and QEMU_VIDEO_DEVICE set, ignoring deprecated QEMUVGA!
7: [17:53:06] ./t/18-backend-qemu.t ...................... ok 4308 ms ( 0.02 usr 0.00 sys + 3.85 cusr 0.47 csys = 4.34 CPU)
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> No unhandled output from t/18-backend-qemu.t</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Call <code>prove -I. t/18-backend-qemu.t</code> locally to reproduce</li>
<li>Check which part of the code triggers the output, e.g. by checking the output of <code>prove -v ...</code></li>
<li>Surround the calls with appropriate output-capturing code, as we already do in many other cases, likely even in the same test module.</li>
</ul>
openQA Project - action #113141 (Resolved): [sporadic] OBS checks fail os-autoinst test "Calling ... | https://progress.opensuse.org/issues/113141 | 2022-07-01T08:41:08Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://build.opensuse.org/package/live_build_log/devel:openQA/os-autoinst/openSUSE_Tumbleweed/aarch64" class="external">https://build.opensuse.org/package/live_build_log/devel:openQA/os-autoinst/openSUSE_Tumbleweed/aarch64</a></p>
<pre><code>[ 985s] 3:
[ 985s] 3: # Failed test 'Calling 'isotovideo --help' returns exit code 0'
[ 985s] 3: # at t/44-scripts.t line 29.
[ 985s] 3: # got: '31744'
[ 985s] 3: # expected: '0'
[ 985s] 3: # Output:
[ 987s] 3: # Looks like you failed 1 test of 6.
[ 987s] 3: [12:48:35] t/44-scripts.t ...........................
[ 987s] 3: ok 1 - Calling 'check_needles.pl --help' returns exit code 0
[ 987s] 3: ok 2 - Calling 'check_qemu_oom --help' returns exit code 0
[ 987s] 3: ok 3 - Calling 'imgsearch --help' returns exit code 0
[ 987s] 3: not ok 4 - Calling 'isotovideo --help' returns exit code 0
[ 987s] 3: ok 5 - Calling 'os-autoinst-openvswitch --help' returns exit code 0
[ 987s] 3: ok 6 - no (unexpected) warnings (via done_testing)
[ 987s] 3: 1..6
[ 987s] 3: Dubious, test returned 1 (wstat 256, 0x100)
[ 987s] 3: Failed 1/6 subtests
</code></pre>
<p>Seems to happen a lot lately, but only on aarch64 and apparently not every time; the build is green at the time of writing.<br>
Exit code 31744 >> 8 = 124, the exit status that <code>timeout</code> conventionally reports when a command exceeded its time limit, so the command was most likely terminated by SIGTERM after running too long. This may be a race condition caused by multiple background processes, which we shouldn't even spawn when we just want to call "--help", and it only really shows up on aarch64 because that is slower.</p>
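<p>The arithmetic behind the value 31744 can be checked with a small sketch. It assumes the usual 16-bit POSIX wait status layout (exit code in the high byte, terminating signal in the low 7 bits); <code>decode_wait_status</code> is an illustrative helper written in Python, not a function from the test suite.</p>

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX wait status into (exit_code, signal, core_dumped)."""
    exit_code = (status >> 8) & 0xFF   # high byte: exit code of the process
    signal = status & 0x7F             # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(status & 0x80)  # bit 7: core dump flag
    return exit_code, signal, core_dumped


print(decode_wait_status(31744))  # value from the failed test: (124, 0, False)
print(decode_wait_status(256))    # "wstat 256" from the prove summary: (1, 0, False)
```

<p>The signal part is 0 in both cases, so the process reported exit code 124 itself and was not recorded as killed by a signal at this level.</p>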
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>For now exclude the test from OBS checks on aarch64</li>
<li>Check the "--help" route, and see what else might be running in the background</li>
<li>Fix the problem</li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>Enable test on OBS checks again</li>
</ul>
openQA Project - action #104751 (Resolved): Extend "_SECRET_" variable handling to os-autoinst ba... | https://progress.opensuse.org/issues/104751 | 2022-01-10T09:12:30Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>We already avoid writing any variable with "_SECRET_" in its name to vars.json for security reasons. Within os-autoinst we have other security-relevant data, e.g. passwords, that we should likely treat the same way.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Remote backend passwords don't appear in vars.json by default</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Call <code>git grep '_SECRET_'</code> to find all current handling of <code>_SECRET_</code> variables</li>
<li>Extend that to also look for <code>_PASSWORD</code></li>
<li>Ensure that the values for the backend passwords don't show up in vars.json, e.g. no IPMI_PASSWORD entry as in <a href="https://openqa.nue.suse.com/tests/7924361/file/vars.json" class="external">https://openqa.nue.suse.com/tests/7924361/file/vars.json</a></li>
<li>Consider what happens when cloning such jobs. Do they fail because the password is missing?</li>
</ul>
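<p>The filtering idea from the suggestions above can be sketched as follows. This is a Python illustration under assumptions, not the os-autoinst implementation: <code>SENSITIVE_MARKERS</code> and <code>without_sensitive</code> are made-up names, and the example values are invented.</p>

```python
import json

# Assumption: name fragments that mark a variable as sensitive.
SENSITIVE_MARKERS = ("_SECRET_", "_PASSWORD")


def without_sensitive(variables):
    """Return a copy of `variables` without entries whose name looks sensitive."""
    return {
        name: value
        for name, value in variables.items()
        if not any(marker in name for marker in SENSITIVE_MARKERS)
    }


# Invented example data; only the non-sensitive entry is serialized.
example = {"BACKEND": "ipmi", "IPMI_PASSWORD": "invented", "_SECRET_TOKEN": "invented"}
print(json.dumps(without_sensitive(example), sort_keys=True))
```

<p>Filtering only at serialization time keeps the values available to the running backend, which matters for the question of what happens when such a job is cloned from its vars.json.</p>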
openQA Project - action #103584 (Resolved): job incompletes with exception in OpenCV code "Assert... | https://progress.opensuse.org/issues/103584 | 2021-12-07T07:38:24Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://openqa.opensuse.org/tests/2073991/logfile?filename=autoinst-log.txt" class="external">https://openqa.opensuse.org/tests/2073991/logfile?filename=autoinst-log.txt</a> shows</p>
<pre><code>[2021-12-07T08:35:39.375289+01:00] [debug] activate_console, console: root-ssh, type: ssh
[2021-12-07T08:35:39.375987+01:00] [debug] tests/jeos/prepare_firstboot.pm:32 called testapi::select_console -> lib/susedistribution.pm:807 called susedistribution::handle_password_prompt -> lib/susedistribution.pm:48 called testapi::assert_screen
[2021-12-07T08:35:39.376345+01:00] [debug] <<< testapi::assert_screen(mustmatch="password-prompt", timeout=60)
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.5.4) /home/abuild/rpmbuild/BUILD/opencv-4.5.4/modules/imgproc/src/smooth.dispatch.cpp:293: error: (-215:Assertion failed) ksize.width > 0 && ksize.width % 2 == 1 && ksize.height > 0 && ksize.height % 2 == 1 in function 'createGaussianKernels'
Unexpected end of data 0
X connection to :38581 broken (explicit kill or server shutdown).
[2021-12-07T08:35:39.418186+01:00] [debug] backend process exited: 0
</code></pre>
<p>os-autoinst version is 4.6.1638289529.0a3f5b98 [interface v24]. This is running on siodtw01:4 (Linux 5.15.5-1-default #1 SMP Thu Nov 25 09:36:40 UTC 2021 (83fc974) aarch64)</p>
openQA Project - action #81899 (Resolved): Move code from isotovideo to a module size:M | https://progress.opensuse.org/issues/81899 | 2021-01-08T12:59:43Z | tinita (tina.mueller+trick-redmine@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Having code in a module (ideally under <code>lib/</code>) and separated into functions makes it easier to test and mock.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Move code out of <a href="https://github.com/os-autoinst/os-autoinst/blob/master/isotovideo" class="external">isotovideo</a>, no files/cmake involved here</li>
</ul>
openQA Project - action #78019 (Rejected): [sporadic] os-autoinst t/18-backend-qemu.t timed out i... | https://progress.opensuse.org/issues/78019 | 2020-11-16T13:30:30Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://build.opensuse.org/package/live_build_log/devel:openQA:TestGithub:OPR-1567/os-autoinst/openSUSE_Factory/x86_64" class="external">https://build.opensuse.org/package/live_build_log/devel:openQA:TestGithub:OPR-1567/os-autoinst/openSUSE_Factory/x86_64</a> shows</p>
<pre><code>[ 191s] 3: ./12-bmwqemu.t ........................... ok
[ 191s] 3: ./15-logging.t ........................... ok
[ 191s] 3: ./16-send_with_fd.t ...................... ok
[ 191s] 3: ./17-basetest.t .......................... ok
[ 191s] 3: Bailout called. Further testing stopped: test exceeds runtime limit of '10' seconds
[ 191s] 3: FAILED--Further testing stopped: test exceeds runtime limit of '10' seconds
[ 191s] 3/3 Test #3: test-perl-testsuite ..............***Failed 109.48 sec
[ 191s]
</code></pre>
<p>This likely means that the <em>next</em> test module, which is not explicitly mentioned here, times out. According to the alphabetical order that would be "t/18-backend-qemu.t".</p>
<p>Locally, with <code>count_fail_ratio prove -v -I . -I external/os-autoinst-common/lib -l --timer t/18-backend-qemu.t</code> using <a href="https://github.com/okurz/scripts/tree/master/count_fail_ratio" class="external">https://github.com/okurz/scripts/tree/master/count_fail_ratio</a>, I observe a very consistent runtime of 1s.</p>
<p>With caa97e71 checked out I can reproduce the same times, so if this is a regression it could be in the dependencies.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Stable test locally, in Travis CI and OBS</li>
<li><strong>AC2:</strong> No significant increase in runtime</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>If a quick fix is not feasible, please at least prevent the flaky test result, e.g. by bumping the timeout in the test module and referencing this ticket</li>
<li>Old Travis CI logs give us no indication of the runtime of individual test modules, so I suggest introducing an environment variable, as we already have for openQA, that adds the prove option <code>--timer</code>, and setting it in CI but not by default locally</li>
<li>Crosscheck locally, compare to old results</li>
<li>If necessary bump up the timeout, but ensure that we do not reintroduce the performance regression that we addressed in caa97e71</li>
</ul>
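<p>The environment-variable switch for <code>--timer</code> suggested above could be sketched like this. It is a Python illustration only; <code>PROVE_TIMER</code> and <code>prove_command</code> are hypothetical names, not existing settings of os-autoinst or openQA.</p>

```python
import os


def prove_command(test_files, env=None):
    """Build a `prove` argument list, adding --timer only when requested."""
    env = os.environ if env is None else env
    cmd = ["prove", "-I", "."]
    # PROVE_TIMER is a hypothetical variable name used only for this sketch.
    if env.get("PROVE_TIMER"):
        cmd.append("--timer")
    return cmd + list(test_files)


print(prove_command(["t/18-backend-qemu.t"], env={"PROVE_TIMER": "1"}))
```

<p>CI would export the variable so that per-module runtimes show up in the logs, while local runs keep the shorter default output.</p>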
<a name="Workaround"></a>
<h2 >Workaround<a href="#Workaround" class="wiki-anchor">¶</a></h2>
<p>Retrigger test</p>
openQA Project - action #75232 (Resolved): error message when worker has no network (yet): Unable... | https://progress.opensuse.org/issues/75232 | 2020-10-24T11:24:09Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Currently openqaworker8 has problems bringing the network up due to <a class="issue tracker-4 status-3 priority-6 priority-high2 closed behind-schedule" title="action: OSD partially unresponsive, triggering 500 responses, spotty response visible in monitoring panel... (Resolved)" href="https://progress.opensuse.org/issues/73633">#73633</a>. All workers seem to handle the slow startup gracefully, but there is an error (disguised as a debug message) in the output of <code>journalctl -b -u openqa-worker@1</code>:</p>
<pre><code>-- Logs begin at Wed 2018-03-07 16:47:21 CET, end at Sat 2020-10-24 13:11:47 CEST. --
Oct 24 13:09:14 linux-fwcx systemd[1]: Starting openQA Worker #1...
Oct 24 13:09:15 linux-fwcx systemd[1]: Started openQA Worker #1.
Oct 24 13:09:16 linux-fwcx worker[3296]: [2020-10-24T13:09:16.359 CEST] [debug] Unable to serialize fatal error: Can't open file "base_state.json": Permission denied at /usr/lib/os-autoinst/bmwqemu.pm line 86.
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] worker 1:
Oct 24 13:09:16 linux-fwcx worker[3296]: - config file: /etc/openqa/workers.ini
Oct 24 13:09:16 linux-fwcx worker[3296]: - worker hostname: linux-fwcx
Oct 24 13:09:16 linux-fwcx worker[3296]: - isotovideo version: 20
Oct 24 13:09:16 linux-fwcx worker[3296]: - websocket API version: 1
Oct 24 13:09:16 linux-fwcx worker[3296]: - web UI hosts: openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: - class: caasp_x86_64,tap,qemu_x86_64,openqaworker8
Oct 24 13:09:16 linux-fwcx worker[3296]: - no cleanup: no
Oct 24 13:09:16 linux-fwcx worker[3296]: - pool directory: /var/lib/openqa/pool/1
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] CACHE: caching is enabled, setting up /var/lib/openqa/cache/openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] Project dir for host openqa.suse.de is /var/lib/openqa/share
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] Registering with openQA openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: [warn] [pid:3296] Failed to register at openqa.suse.de - connection error: Can't connect: Name or service not known - trying again in 10 seconds
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Registering with openQA openqa.suse.de
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Establishing ws connection via ws://openqa.suse.de/api/v1/ws/1358
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Registered and connected via websockets with openQA host openqa.suse.de and worker ID 1358
</code></pre>
<p>See the message "Unable to serialize fatal error: Can't open file "base_state.json": Permission denied at /usr/lib/os-autoinst/bmwqemu.pm line 86."</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> No error message about failing to open the file on startup, e.g. when there is no active network connection (yet)</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<p>Either prevent the error condition after analyzing the code, or first try to reproduce the issue in an environment with a simulated broken network connection, e.g. using the tool "unshare" as we do in some of our tests, or in a clean container environment. It might be that this has nothing to do with the network but just with the startup of a worker.</p>
<a name="Workaround"></a>
<h2 >Workaround<a href="#Workaround" class="wiki-anchor">¶</a></h2>
<p>Error message can be ignored</p>
openQA Project - action #70615 (New): Calling select_serial_terminal() twice on s390x svirt backe... | https://progress.opensuse.org/issues/70615 | 2020-08-27T15:23:01Z | MDoucha (martin.doucha@suse.com)
<p>When the same job calls <code>select_serial_terminal()</code> twice on an s390x svirt worker (e.g. once before and once after a reboot), the test crashes with the following error:</p>
<pre><code># wait_serial expected: qr/login:\s*$/ui
# Result:
Script started, file is /tmp/serial_terminal.txt.DjErjAe114GKpV_a
Connected to domain openQA-SUT-3
Escape character is ^]
error: operation failed: Active console session exists for this domain
CONSOLE_EXIT_DjErjAe114GKpV_a: 1
Script done, file is /tmp/serial_terminal.txt.DjErjAe114GKpV_a
</code></pre>
<hr>
<pre><code># Test died: Failed to wait for login prompt at /var/lib/openqa/cache/openqa.suse.de/tests/sle/lib/serial_terminal.pm line 113.
</code></pre>
<p><a href="https://openqa.suse.de/tests/4600947#step/install_klp_product/32" class="external">https://openqa.suse.de/tests/4600947#step/install_klp_product/32</a></p>
<p>Calling <code>select_serial_terminal()</code> multiple times works fine on other archs.</p>
openQA Project - action #69700 (New): Predefined QEMU hardware profiles in os-autoinst | https://progress.opensuse.org/issues/69700 | 2020-08-07T10:15:47Z | MDoucha (martin.doucha@suse.com)
<p>We've recently had a <a href="https://bugzilla.suse.com/show_bug.cgi?id=1174887" class="external">regression</a> that made our kernels unbootable in QEMU VMs created in virt-manager. The regression was missed by openQA tests because os-autoinst uses VMs with a minimal hardware configuration, which didn't trigger the bug.</p>
<p>We should define multiple QEMU hardware profiles (named sets of extra device options for QEMU) which can then be selected through job settings. The hardware profiles don't need to cover every possible combination of devices, it'll be enough if each device model appears in them at least once. One of the profiles should be as close to virt-manager defaults as possible. Then it'll be sufficient to boot the existing LTP jobs on different hardware profiles. We don't need any extra tests beyond checking that the kernel is bootable.</p>
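<p>One way to picture such named profiles is a simple lookup table keyed by profile name. The following Python sketch is only an illustration: the profile names, the <code>QEMU_HW_PROFILE</code> setting and the <code>extra_qemu_args</code> helper are assumptions, and the only real device options used are the ones from the example profile in this ticket.</p>

```python
# Hypothetical profile names. The "q35-pcie-bridge" options are copied from the
# example profile in this ticket; "minimal" stands for no extra devices.
HARDWARE_PROFILES = {
    "minimal": [],
    "q35-pcie-bridge": [
        "-machine", "pc-q35-4.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off",
        "-device", "pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2",
        "-device", "pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1",
        "-device", "pcie-pci-bridge,id=pci.9,bus=pci.2,addr=0x0",
    ],
}


def extra_qemu_args(job_settings):
    """Look up the extra QEMU options for the profile named in the job settings."""
    # QEMU_HW_PROFILE is an assumed setting name used only for this sketch.
    name = job_settings.get("QEMU_HW_PROFILE", "minimal")
    if name not in HARDWARE_PROFILES:
        raise ValueError(f"unknown hardware profile: {name}")
    return list(HARDWARE_PROFILES[name])
```

<p>Rejecting unknown names early avoids silently booting the default hardware when a job setting contains a typo.</p>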
<p>Example profile that would trigger the regression:</p>
<pre><code>-machine pc-q35-4.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1
-device pcie-pci-bridge,id=pci.9,bus=pci.2,addr=0x0
</code></pre>
openQA Project - action #69691 (Workable): Improve incomplete output for qemu related problems, e... | https://progress.opensuse.org/issues/69691 | 2020-08-07T09:06:05Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://openqa.opensuse.org/tests/1355023">https://openqa.opensuse.org/tests/1355023</a> shows reason "backend died: can't open qmp at /usr/lib/os-autoinst/OpenQA/Qemu/Proc.pm line 448." so one needs to take a look into the logfile which shows:</p>
<pre><code>[2020-08-06T12:11:40.169 UTC] [debug] starting: /usr/bin/qemu-system-ppc64 -g 1024x768 -vga std -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -soundhw hda -global isa-fdc.driveA= -m 4096 -machine usb=off -cpu host -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -boot order=c -device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 4 -enable-kvm -no-shutdown -vnc :93,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay2-file,filename=/var/lib/openqa/pool/3/raid/hd0-overlay2,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay2,file=hd0-overlay2-file,cache.no-flush=on -device virtio-blk,id=hd0-device,drive=hd0-overlay2,bootindex=0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay2-file,filename=/var/lib/openqa/pool/3/raid/cd0-overlay2,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay2,file=cd0-overlay2-file,cache.no-flush=on -device scsi-cd,id=cd0-device,drive=cd0-overlay2,serial=cd0 -incoming defer
[2020-08-06T12:11:40.174 UTC] [debug] Waiting for 0 attempts
…
[2020-08-06T12:11:58.443 UTC] [debug] Waiting for 19 attempts
[2020-08-06T12:11:59.444 UTC] [debug] Backend process died, backend errors are reported below in the following lines:
can't open qmp at /usr/lib/os-autoinst/OpenQA/Qemu/Proc.pm line 448.
[2020-08-06T12:11:59.444 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2020-08-06T12:11:59.445 UTC] [debug] flushing frames
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: QEMU emulator version 3.1.1.1 (openSUSE Leap 15.1)
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: Unknown host!
[2020-08-06T12:11:59.447 UTC] [debug] QEMU: qemu-system-ppc64: Failed to allocate KVM HPT of order 25 (try smaller maxmem?): Cannot allocate memory
[2020-08-06T12:11:59.447 UTC] [debug] sending magic and exit
[2020-08-06T12:11:59.448 UTC] [debug] received magic close
[2020-08-06T12:11:59.449 UTC] [debug] THERE IS NOTHING TO READ 15 4 3
</code></pre>
<p>where the latter is only visible in the log file.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> The reason includes the content from the last qemu output</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>For qemu related problems try to parse the (last) line that starts with "QEMU: " and put that into the reason instead of "can't open qmp"</li>
</ul>
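<p>The suggestion above amounts to scanning the log for lines containing "QEMU: " and keeping the last one. A minimal Python sketch, assuming the log line format shown in the observation; <code>last_qemu_message</code> and <code>improved_reason</code> are illustrative names and not existing os-autoinst functions:</p>

```python
import re

# Matches log lines such as "[2020-08-06T12:11:59.447 UTC] [debug] QEMU: message"
QEMU_LINE = re.compile(r"\[debug\] QEMU: (?P<message>.*)$")


def last_qemu_message(log_text):
    """Return the text of the last "QEMU: " log line, or None if there is none."""
    message = None
    for line in log_text.splitlines():
        match = QEMU_LINE.search(line)
        if match:
            message = match.group("message").strip()
    return message


def improved_reason(original_reason, log_text):
    """Prefer the last QEMU message over the generic reason when one exists."""
    return last_qemu_message(log_text) or original_reason
```

<p>For the log above this would yield the "Failed to allocate KVM HPT of order 25" message rather than "can't open qmp". Falling back to the original reason keeps the behavior unchanged for logs without QEMU output.</p>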
openQA Project - action #67429 (New): Raw text console capability | https://progress.opensuse.org/issues/67429 | 2020-05-28T16:28:29Z | livdywan (liv.dywan@suse.com)
<ul>
<li>Replace <code>is_serial_terminal</code> with <code>direct_write_text</code>.</li>
</ul>
<p>This is the console analogue of <a class="issue tracker-4 status-1 priority-3 priority-lowest child" title="action: Raw text backend capability (New)" href="https://progress.opensuse.org/issues/67426">#67426</a>.</p>
openQA Project - action #67426 (New): Raw text backend capability | https://progress.opensuse.org/issues/67426 | 2020-05-28T16:26:07Z | livdywan (liv.dywan@suse.com)
<ul>
<li>Move <code>select_serial_terminal</code> logic into each backend subclass</li>
</ul>
openQA Project - action #67423 (New): Persistent console backend capabilityhttps://progress.opensuse.org/issues/67423 | 2020-05-28T16:24:16Z | livdywan (liv.dywan@suse.com)
<ul>
<li>Replace the <code>s390x</code> and <code>is_pvm</code> checks which guard ssh console/reboot availability (via <code>keepconsole => 1</code>).</li>
<li>Remove redundant checks for the backend in favor of relying on the backend itself.</li>
</ul>
<p>This is the backend capability analogous to <a class="issue tracker-4 status-1 priority-3 priority-lowest child" title="action: Persistent console console capability (New)" href="https://progress.opensuse.org/issues/67420">#67420</a>.</p>
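<p>The idea of relying on the backend itself rather than on architecture checks can be sketched with capability attributes. This Python illustration is purely hypothetical: the class names, attribute names and values are assumptions and are not taken from os-autoinst.</p>

```python
class Backend:
    """Base class: capabilities default to the most conservative answer."""

    persistent_console = False  # can a console survive a reboot of the SUT?
    raw_text = False            # can text be written to the console directly?


class ExamplePersistentBackend(Backend):
    # Hypothetical subclass; the value is illustrative only.
    persistent_console = True


def needs_console_reset_after_reboot(backend):
    """Test code asks the backend instead of checking for s390x or is_pvm."""
    return not backend.persistent_console
```

<p>With this shape, supporting a new backend means overriding an attribute in its subclass, and the test code needs no additional architecture checks.</p>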