openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?1582917784 | 2024-03-15T10:13:41Z | openSUSE Project Management Tool | Redmine
openQA Project - action #157333 (Closed): Log all job setting changes in autoinst-log.txt | https://progress.opensuse.org/issues/157333 | 2024-03-15T10:13:41Z | MDoucha (martin.doucha@suse.com)
<p>All job settings should be logged in autoinst-log.txt with source of the value (e.g. the place where <code>set_var()</code> was called or whether they were added from product/medium/worker etc.)</p>
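<p>A minimal sketch of the idea, not the os-autoinst implementation: a settings store that records a source string alongside every value and logs each change. The class name <code>TrackedSettings</code>, the <code>source</code> argument and the log format are assumptions made up for illustration.</p>

```python
import logging

logger = logging.getLogger("autoinst")


class TrackedSettings:
    """Job settings that remember where each value came from (hypothetical helper)."""

    def __init__(self):
        self._values = {}
        self._sources = {}

    def set_var(self, name, value, source):
        # Log every change together with its origin, e.g. "worker",
        # "product", "medium" or a "file:line" call site.
        old = self._values.get(name)
        self._values[name] = value
        self._sources[name] = source
        logger.info("setting %s: %r -> %r (source: %s)", name, old, value, source)

    def get_var(self, name, default=None):
        return self._values.get(name, default)

    def source_of(self, name):
        # Returns None for settings that were never set.
        return self._sources.get(name)
```

<p>Keeping the source next to the value means the log line can be emitted at the single point where settings change, instead of at every call site.</p>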
QA - action #123748 (Resolved): [tools] Add support for excluding packages from test flavor in bo... | https://progress.opensuse.org/issues/123748 | 2023-01-27T12:53:19Z | MDoucha (martin.doucha@suse.com)
<p>SLE-15SP4 livepatching channel will include packages for userspace livepatching which need standard single incident and aggregate tests. Incident scheduling logic in bot config therefore needs support for package exclusion so that the livepatching channel can be enabled for single incidents without flooding the job groups with kernel livepatch tests. Example:</p>
<pre><code>Server-DVD-Incidents:
  archs:
    - x86_64
  issues:
    ...
  exclude_packages:
    - kernel-livepatch
</code></pre>
<p>Any incident that contains a package with the given name (or name prefix) will be skipped for the parent flavor regardless of what else it contains.</p>
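<p>The matching rule described above could be sketched as follows. This is an illustration under assumptions, not the bot's actual code: the function names and the dictionary shape of the flavor config are hypothetical, and only the <code>exclude_packages</code> key comes from the example.</p>

```python
def incident_is_excluded(incident_packages, exclude_packages):
    """Return True if any incident package matches an excluded name or name prefix."""
    # str.startswith() covers both cases: an exact name is its own prefix.
    return any(
        package.startswith(prefix)
        for package in incident_packages
        for prefix in exclude_packages
    )


def schedule_flavor(incident_packages, flavor_config):
    # `exclude_packages` is optional; a flavor without it never skips an incident.
    excluded = flavor_config.get("exclude_packages", [])
    return not incident_is_excluded(incident_packages, excluded)
```

<p>A single matching package is enough to skip the incident for that flavor, which is the "regardless of what else it contains" part of the request.</p>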
openQA Project - action #121774 (In Progress): LTP cgroup test appears to crash OpenQA worker ins... | https://progress.opensuse.org/issues/121774 | 2022-12-09T13:36:47Z | MDoucha (martin.doucha@suse.com)
<p>LTP test cgroup_fj_stress_blkio_4_4_each on the latest SLE-15SP1 KOTD kernel appears to crash the OpenQA worker instance it's running on. The test itself succeeds, but the OpenQA job stays stuck in <code>wait_serial()</code> for several hours (despite the 90-second timeout) until the whole job fails on MAX_JOB_TIME. There are 3 examples so far:<br>
<a href="https://openqa.suse.de/tests/10089424#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10089424#step/cgroup_fj_stress_blkio_4_4_each/7</a><br>
<a href="https://openqa.suse.de/tests/10111009#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10111009#step/cgroup_fj_stress_blkio_4_4_each/7</a><br>
<a href="https://openqa.suse.de/tests/10113099#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10113099#step/cgroup_fj_stress_blkio_4_4_each/7</a></p>
<p>I've seen this issue only on SLE-15SP1 KOTD builds 156 and 157. I have not seen any cases on other SLE versions.</p>
<p>Typical autoinst-log.txt entries related to the timeout:</p>
<pre><code>[2022-12-06T08:52:27.432374+01:00] [debug] <<< testapi::script_run(cmd="vmstat -w", output="", quiet=undef, timeout=30, die_on_timeout=1)
[2022-12-06T08:52:27.432549+01:00] [debug] tests/kernel/run_ltp.pm:334 called testapi::script_run
[2022-12-06T08:52:27.432710+01:00] [debug] <<< testapi::wait_serial(record_output=undef, regexp="# ", quiet=undef, no_regex=1, buffer_size=undef, expect_not_found=0, timeout=90)
[2022-12-06T10:39:58.278597+01:00] [debug] autotest received signal TERM, saving results of current test before exiting
[2022-12-06T10:39:58.278622+01:00] [debug] isotovideo received signal TERM
[2022-12-06T10:39:58.278748+01:00] [debug] backend got TERM
</code></pre>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/10091628" class="external">4.12.14-150100.156.1.gb6c27ee</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Incidents-Kernel-KOTD&machine=64bit&test=ltp_controllers&version=15-SP1" class="external">latest</a></p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce:<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<ol>
<li>Run <code>ltp_controllers</code> testsuite on SLE-15SP1 KOTD</li>
<li>Wait.</li>
</ol>
openQA Tests - action #116287 (Rejected): [qe-core][s390x] SSH serial terminal connection issues ... | https://progress.opensuse.org/issues/116287 | 2022-09-06T13:54:08Z | MDoucha (martin.doucha@suse.com)
<p>s390x livepatch tests had a lot of installation failures this month due to SSH serial terminal connection failures. Interestingly enough, the connection failures seem to happen around the same module step. serial_terminal.txt output appears to be out of sync with the terminal because part of the commands and output is missing even though it's listed in the update_kernel module details. The dmesg output in serial0.txt often (but not always) shows some key exchange SSH error followed by output from a completely different job:</p>
<pre><code>Welcome to SUSE Linux Enterprise Server 15 SP2 (s390x) - Kernel 5.3.18-24.83-default (ttysclp0).
eth0: 10.161.145.86 fe80::5054:ff:fe84:f877
susetest login: root
Password:
Last login: Mon Sep 5 10:18:10 from 10.160.0.147
susetest:~ # systemctl is-active network
active
susetest:~ # systemctl is-active sshd
active
susetest:~ # 2022-09-05T10:25:03.604370-04:00 susetest sshd[4272]: error: kex_exchange_identification: Connection closed by remote host
2022-09-05T10:25:04.844743-04:00 susetest sshd[4273]: error: kex_exchange_identification: Connection closed by remote host
[ 107.444474] LTP: starting DI000 (dirty)
[ 107.445525] LTP: starting DS000 (dio_sparse)
[ 107.466125] LTP: starting abort01
[ 107.758318] LTP: starting accept01
</code></pre>
<p>12-SP4: <a href="https://openqa.suse.de/tests/9438804#step/update_kernel/337" class="external">https://openqa.suse.de/tests/9438804#step/update_kernel/337</a><br>
15-SP2: <a href="https://openqa.suse.de/tests/9457752#step/update_kernel/337" class="external">https://openqa.suse.de/tests/9457752#step/update_kernel/337</a><br>
15-SP3: <a href="https://openqa.suse.de/tests/9458645#step/update_kernel/337" class="external">https://openqa.suse.de/tests/9458645#step/update_kernel/337</a><br>
15-SP4: <a href="https://openqa.suse.de/tests/9455666#step/update_kernel/199" class="external">https://openqa.suse.de/tests/9455666#step/update_kernel/199</a></p>
<p>I could not find any such connection failure on SLE-12SP5. Other SLE releases don't support s390x livepatches and KOTD tests don't show this kind of issue. This looks like a kernel bug but I'd like an s390x expert to look at this before I create a Bugzilla ticket. And of course this has exposed logging issues in OpenQA.</p>
openQA Project - action #107701 (Resolved): [osd] Job detail page fails to load | https://progress.opensuse.org/issues/107701 | 2022-02-28T14:34:19Z | MDoucha (martin.doucha@suse.com)
<p>The job detail page for the following ltp_syscalls_secureboot job is timing out:<br>
<a href="https://openqa.suse.de/tests/8232404" class="external">https://openqa.suse.de/tests/8232404</a></p>
<p>Please investigate why and fix it if possible.</p>
openQA Project - action #106898 (Resolved): Protection against asset clobbering | https://progress.opensuse.org/issues/106898 | 2022-02-16T10:33:47Z | MDoucha (martin.doucha@suse.com)
<p>QCOW images in OpenQA occasionally get corrupted because multiple jobs try to publish the same file at the same time, either due to <code>PUBLISH_*</code> setting misconfiguration or duplicate install jobs scheduled in parallel. For example, this job failed to start:<br>
<a href="https://openqa.suse.de/tests/8162749" class="external">https://openqa.suse.de/tests/8162749</a></p>
<p>because these three install jobs finished 20 minutes apart and tried to upload the same QCOW image:<br>
<a href="https://openqa.suse.de/tests/8162347" class="external">https://openqa.suse.de/tests/8162347</a><br>
<a href="https://openqa.suse.de/tests/8161501" class="external">https://openqa.suse.de/tests/8161501</a><br>
<a href="https://openqa.suse.de/tests/8160547" class="external">https://openqa.suse.de/tests/8160547</a></p>
<p>Please add some sort of protection against asset clobbering via <code>PUBLISH_*</code> variables:</p>
<ul>
<li>two jobs must not publish the same file in parallel</li>
<li>jobs must not publish a file while another job may be downloading the previous version</li>
<li><code>PUBLISH_*</code> misconfiguration (e.g. copy-paste mistakes among multiple testsuites) should be detected and reported in the WebUI, for example as the reason why install job was terminated</li>
</ul>
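<p>One possible shape for the first two requirements, sketched under assumptions rather than taken from openQA: an exclusive lock file guards each asset, and the new image is written to a temporary name and swapped in atomically. The <code>.lock</code> and <code>.part</code> suffixes, the function names and the exception class are all hypothetical. Stale locks left by a crashed job and detection of <code>PUBLISH_*</code> misconfiguration are not handled here.</p>

```python
import os
from contextlib import contextmanager


class AssetBusyError(RuntimeError):
    """Raised when another job already holds the publish lock for an asset."""


@contextmanager
def publish_lock(asset_path):
    lock_path = asset_path + ".lock"
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AssetBusyError(f"{asset_path} is already being published") from None
    try:
        # Record the owner so a stuck lock can be traced to a process.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def publish_asset(asset_path, data):
    with publish_lock(asset_path):
        tmp_path = asset_path + ".part"
        with open(tmp_path, "wb") as tmp:
            tmp.write(data)
        # os.replace() swaps the file in atomically on the same filesystem,
        # so readers see either the old or the new version, never a partial one.
        os.replace(tmp_path, asset_path)
```

<p>On POSIX filesystems a job that already opened the previous image keeps reading the old file after the swap, which is why the sketch replaces the file instead of overwriting it in place.</p>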
openQA Project - action #98841 (Resolved): qemu randomly fails to start on QA-Power8-5-kvm auto_r... | https://progress.opensuse.org/issues/98841 | 2021-09-17T15:49:43Z | MDoucha (martin.doucha@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>A few LTP jobs have failed to start today due to qemu error on QA-Power8-5-kvm worker:<br>
<a href="https://openqa.suse.de/tests/7138972">https://openqa.suse.de/tests/7138972</a><br>
<a href="https://openqa.suse.de/tests/7149857">https://openqa.suse.de/tests/7149857</a><br>
<a href="https://openqa.suse.de/tests/7153989">https://openqa.suse.de/tests/7153989</a></p>
<p>All of them have the following output in autoinst-log.txt:</p>
<pre><code>[2021-09-17T16:24:29.803 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
QEMU terminated before QMP connection could be established. Check for errors below
[2021-09-17T16:24:29.804 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2021-09-17T16:24:29.805 CEST] [debug] Passing remaining frames to the video encoder
[2021-09-17T16:24:29.805 CEST] [debug] Waiting for video encoder to finalize the video
[2021-09-17T16:24:29.805 CEST] [debug] The built-in video encoder (pid 110385) terminated
[2021-09-17T16:24:29.807 CEST] [debug] QEMU: QEMU emulator version 4.2.1 (openSUSE Leap 15.2)
[2021-09-17T16:24:29.807 CEST] [debug] QEMU: Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
[2021-09-17T16:24:29.807 CEST] [warn] !!! : qemu-system-ppc64: Failed to allocate KVM HPT of order 25 (try smaller maxmem?): Cannot allocate memory
</code></pre>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>QA-Power8-5-kvm has 256GB RAM. <a href="https://monitor.qa.suse.de/d/WDQA-Power8-5-kvm/worker-dashboard-qa-power8-5-kvm?viewPanel=12054&orgId=1&from=1631765162464&to=1632085860553">https://monitor.qa.suse.de/d/WDQA-Power8-5-kvm/worker-dashboard-qa-power8-5-kvm?viewPanel=12054&orgId=1&from=1631765162464&to=1632085860553</a> shows that some memory was used during the period when the test failed but nothing that should explain the inability to allocate the memory for the qemu VM. In the system journal there is</p>
<pre><code>Sep 17 16:24:28 QA-Power8-5-kvm worker[88148]: [debug] [pid:88148] REST-API call: POST http://openqa.suse.de/api/v1/jobs/7125263/status
Sep 17 16:24:29 QA-Power8-5-kvm worker[104911]: [info] [pid:108741] sle-15-SP4-ppc64le-Build36.1-HA-BV.qcow2: Processing chunk 501/5812, avg. speed ~976.562 KiB/s
Sep 17 16:24:29 QA-Power8-5-kvm worker[101413]: [debug] [pid:102598] Uploading artefact mq_timedreceive_15-1-2.txt
Sep 17 16:24:29 QA-Power8-5-kvm worker[96737]: [debug] [pid:96737] REST-API call: POST http://openqa.suse.de/api/v1/jobs/7125265/status
Sep 17 16:24:29 QA-Power8-5-kvm worker[109458]: [debug] [pid:110336] Uploading artefact bootloader_start-15.txt
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: 23 callbacks suppressed
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf00, 3cb100) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf04, 3cb104) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf08, 3cb108) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf0c, 3cb10c) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf10, 3cb110) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf14, 3cb114) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf18, 3cb118) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf1c, 3cb11c) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf20, 3cb120) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf24, 3cb124) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: cma: cma_alloc: alloc failed, req-size: 512 pages, ret: -16
Sep 17 16:24:30 QA-Power8-5-kvm worker[88148]: [debug] [pid:88148] Upload concluded (at wait_children)
Sep 17 16:24:30 QA-Power8-5-kvm worker[109557]: [info] [pid:109557] Isotovideo exit status: 1
Sep 17 16:24:30 QA-Power8-5-kvm worker[109557]: [debug] [pid:109557] Stopping job 7153989 from openqa.suse.de: 07153989-sle-15-SP3-Server-DVD-Incidents-Kernel-KOTD-ppc64le-Build5.3.18-302.1.g316993b-ltp_crashme@ppc64le-virtio - reason: died
Sep 17 16:24:30 QA-Power8-5-kvm worker[109557]: [debug] [pid:109557] REST-API call: POST http://openqa.suse.de/api/v1/jobs/7153989/status
Sep 17 16:24:30 QA-Power8-5-kvm worker[101413]: [debug] [pid:102598] Uploading artefact mq_timedreceive_7-1-2.txt
</code></pre>
<p>in particular the messages</p>
<pre><code>Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf20, 3cb120) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: alloc_contig_range: [3caf24, 3cb124) PFNs busy
Sep 17 16:24:29 QA-Power8-5-kvm kernel: cma: cma_alloc: alloc failed, req-size: 512 pages, ret: -16
</code></pre>
<p>which indicate an allocation failure. We could report a bug about this, but because KVM on SUSE with Power8 is unsupported, I don't expect any success.</p>
<p>We likely need to accept such issues and have openQA trigger a restart automatically.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> qemu ppc64le allocation errors cause automatic job retriggers by openQA</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Catch the error and make it "Incomplete"</li>
<li>Restart the incomplete job</li>
<li>Make openQA automatically detect the issue and trigger restart, e.g. based on <a href="https://github.com/os-autoinst/openQA/blob/master/etc/openqa/openqa.ini#L76">https://github.com/os-autoinst/openQA/blob/master/etc/openqa/openqa.ini#L76</a></li>
</ul>
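<p>The detection step from the suggestions above might look like the sketch below. It is an assumption-laden illustration, not openQA's mechanism: the regular expression is derived from the autoinst-log.txt excerpt in this ticket, and the function name and the <code>max_retries</code> cap are hypothetical.</p>

```python
import re

# Pattern derived from the log excerpt in this ticket; treat it as a starting point.
RETRY_PATTERN = re.compile(
    r"Failed to allocate KVM HPT of order \d+.*Cannot allocate memory"
)


def should_retrigger(log_text, retries_so_far, max_retries=3):
    """Decide whether a failed job should be restarted automatically."""
    # Cap the retries so a persistently broken worker cannot loop forever.
    if retries_so_far >= max_retries:
        return False
    return RETRY_PATTERN.search(log_text) is not None
```

<p>Matching on the specific qemu message keeps unrelated incompletes from being restarted silently.</p>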
openQA Project - action #96007 (Resolved): OpenQA jobs randomly time out during setup phase | https://progress.opensuse.org/issues/96007 | 2021-07-26T10:23:59Z | MDoucha (martin.doucha@suse.com)
<p>OpenQA jobs have been incompleting more than usual in the past few weeks. The incompletes I've seen just today all show the following sequence of messages in worker.log:</p>
<pre><code>[2021-07-24T16:28:45.444 CEST] [debug] started mgmt loop with pid 59928
[2021-07-24T16:28:45.510 CEST] [debug] qemu version detected: 4.2.1
[2021-07-24T16:28:45.512 CEST] [debug] running /usr/bin/chattr -f +C /var/lib/openqa/pool/9/raid
[2021-07-24T18:28:42.557 CEST] [debug] isotovideo received signal TERM
[2021-07-24T18:28:42.558 CEST] [debug] backend got TERM
</code></pre>
<p><a href="https://openqa.suse.de/tests/6552459" class="external">https://openqa.suse.de/tests/6552459</a><br>
<a href="https://openqa.suse.de/tests/6555414" class="external">https://openqa.suse.de/tests/6555414</a><br>
<a href="https://openqa.suse.de/tests/6543695" class="external">https://openqa.suse.de/tests/6543695</a></p>
<p>I'll update this ticket if I find any similar jobs where the last operation before timeout isn't <code>chattr -f +C</code>.</p>
openQA Project - action #94850 (Resolved): QEMU 6.0 fails to start if job has QEMU_NUMA=1 | https://progress.opensuse.org/issues/94850 | 2021-06-29T10:05:52Z | MDoucha (martin.doucha@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Originally reported by pvorel relating to <a href="http://quasar.suse.cz/tests/6860#">http://quasar.suse.cz/tests/6860#</a></p>
<p>It appears that QEMU 6.0 requires explicit memory assignment to NUMA nodes using <code>-numa node,mem=...</code> command line arguments. OpenQA jobs with <code>QEMU_NUMA=1</code> currently fail to start on Tumbleweed because os-autoinst omits the <code>mem</code> option, which leads to the following error:</p>
<pre><code>[2021-06-28T13:26:00.308 CEST] [debug] starting: /usr/bin/qemu-system-x86_64 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -global isa-fdc.fdtypeA=none -m 2048 -cpu qemu64 -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -boot order=c -device usb-ehci -device usb-tablet -smp 2 -numa node,nodeid=0 -numa node,nodeid=1 -enable-kvm -no-shutdown -vnc :91,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/1/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on -device virtio-blk,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0
[2021-06-28T13:26:00.312 CEST] [debug] Waiting for 0 attempts
[2021-06-28T13:26:00.423 CEST] [debug] Waiting for 1 attempts
[2021-06-28T13:26:00.424 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
QEMU terminated before QMP connection could be established at /usr/lib/os-autoinst/OpenQA/Qemu/Proc.pm line 453.
[2021-06-28T13:26:00.424 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2021-06-28T13:26:00.424 CEST] [debug] Passing remaining frames to the video encoder
[2021-06-28T13:26:00.426 CEST] [debug] Waiting for video encoder to finalize the video
[2021-06-28T13:26:00.426 CEST] [debug] The built-in video encoder (pid 22047) terminated
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: QEMU emulator version 6.0.0 (openSUSE Tumbleweed)
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: qemu-system-x86_64: -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on: warning: short-form boolean option 'server' deprecated
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: Please use server=on instead
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: qemu-system-x86_64: -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on: warning: short-form boolean option 'nowait' deprecated
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: Please use wait=off instead
[2021-06-28T13:26:00.427 CEST] [debug] QEMU: qemu-system-x86_64: total memory for NUMA nodes (0x0) should equal RAM size (0x80000000)
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> os-autoinst continues to start for qemu >= 6.0 and QEMU_NUMA=1</li>
<li><strong>AC2:</strong> os-autoinst jobs still work fine on qemu < 6.0 regardless of QEMU_NUMA setting</li>
</ul>
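<p>A sketch of how both acceptance criteria could be met, assuming the <code>mem=</code> option named in this ticket is the right one: memory is split across nodes only for QEMU 6.0 or newer, and older versions keep the current arguments. The function name and the version tuple are hypothetical, and whether a given machine type accepts <code>mem=</code> or needs a memory backend referenced through <code>memdev=</code> is not verified here.</p>

```python
def numa_args(ram_mb, nodes, qemu_version):
    """Build -numa arguments, assigning memory explicitly for QEMU >= 6.0."""
    args = []
    base, remainder = divmod(ram_mb, nodes)
    for node_id in range(nodes):
        spec = f"node,nodeid={node_id}"
        if qemu_version >= (6, 0):
            # Give the first `remainder` nodes one extra MiB so the
            # per-node sizes always add up to the full RAM size.
            size = base + (1 if node_id < remainder else 0)
            spec += f",mem={size}M"
        args += ["-numa", spec]
    return args
```

<p>For the failing job above (<code>-m 2048</code>, two nodes) this yields 1024M per node, which satisfies the "total memory for NUMA nodes should equal RAM size" check quoted in the log.</p>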
openQA Tests - action #93112 (Resolved): [qe-core][s390x] bootloader_zkvm fails: Cannot allocate ... | https://progress.opensuse.org/issues/93112 | 2021-05-25T15:32:22Z | MDoucha (martin.doucha@suse.com)
<p>s390 jobs randomly fail in <code>bootloader_zkvm</code>. autoinst-log.txt shows the following error:</p>
<pre><code>[debug] [run_ssh_cmd(virsh start openQA-SUT-4 2> >(tee /tmp/os-autoinst-openQA-SUT-4-stderr.log >&2))] stderr:
error: Failed to start domain openQA-SUT-4
error: internal error: qemu unexpectedly closed the monitor: 2021-05-18T11:23:21.183643Z qemu-system-s390x: cannot set up guest memory 's390.ram': Cannot allocate memory
</code></pre>
<p><a href="https://openqa.suse.de/tests/6044126#step/bootloader_zkvm/28" class="external">https://openqa.suse.de/tests/6044126#step/bootloader_zkvm/28</a><br>
<a href="https://openqa.suse.de/tests/6044006#step/bootloader_zkvm/28" class="external">https://openqa.suse.de/tests/6044006#step/bootloader_zkvm/28</a></p>
<p>This appears to be the same problem as <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: [sle][functional][u][s390x[kvm] test fails in bootloader_zkvm - "Cannot allocate memory" when ins... (Resolved)" href="https://progress.opensuse.org/issues/45326">#45326</a> and <a class="issue tracker-4 status-6 priority-4 priority-default closed" title="action: [functional][u] test fails in bootloader_zkvm - qemu-system-s390x: cannot set up guest memory 's3... (Rejected)" href="https://progress.opensuse.org/issues/48404">#48404</a>.</p>
<p>Additional links: <a href="https://openqa.suse.de/tests/latest?arch=s390x&distri=sle&flavor=Server-DVD-Incidents-Kernel&machine=s390x-kvm-sle12&test=install_ltp%2Bsle%2BServer-DVD-Incidents-Kernel&version=15-SP2" class="external">latest job with bootloader_zkvm</a></p>
openQA Project - action #88193 (Resolved): [qe-core] virtio-terminal is missing for non root users | https://progress.opensuse.org/issues/88193 | 2021-01-25T14:25:47Z | MDoucha (martin.doucha@suse.com)
<p>Calling <code>$self->select_user_serial_terminal;</code> (alias for <code>$self->select_serial_terminal(0);</code>) in a test on a QEMU backend results in the following error:</p>
<pre><code>[2021-01-25T14:06:53.271 CET] [debug] tests/x11/ghostscript.pm:45 called opensusebasetest::select_serial_terminal -> lib/opensusebasetest.pm:1243 called testapi::select_console
[2021-01-25T14:06:53.271 CET] [debug] <<< testapi::select_console(testapi_console="virtio-terminal")
console virtio-terminal does not exist at /usr/lib/os-autoinst/backend/driver.pm line 86.
[2021-01-25T14:06:53.319 CET] [info] ::: basetest::runtest: # Test died: Can't call method "select" on an undefined value at /usr/lib/os-autoinst/backend/baseclass.pm line 667.
</code></pre>
<p>The <code>select_serial_terminal</code> method expects to have a non-root virtio console named <code>virtio-terminal</code> but <code>lib/susedistribution.pm</code> does not define any non-root virtio consoles.</p>
openQA Tests - action #64285 (New): [qe-core][qem] Aggregate tests with GM base image | https://progress.opensuse.org/issues/64285 | 2020-03-06T16:39:37Z | MDoucha (martin.doucha@suse.com)
<p>This is a test scenario designed to detect weak dependency breakage which caused certificate issues on SLE-12. <a href="https://bugzilla.suse.com/show_bug.cgi?id=1165915" class="external">https://bugzilla.suse.com/show_bug.cgi?id=1165915</a></p>
<p>Scenario:</p>
<ol>
<li>Start with GM base image of target SLE (only packages from GM pool)</li>
<li>Collect package names from incident repos</li>
<li>Install corresponding packages from GM pool repos</li>
<li>Enable both update repos <strong>AND</strong> incident repos</li>
<li>Do full system update</li>
<li>Run package-specific tests</li>
</ol>
<p>If you don't install old packages from GM pool first, zypper will order packages correctly through transitive dependencies. We're specifically trying to break transitive dependencies here.</p>
<p>If you separate system update from incident installation (splitting step 4), you may accidentally force correct ordering of transitive dependencies through release timing. In that case, dependency bugs will show up only if the packages with the broken weak dependency both end up in the testing queue at the same time (not guaranteed), or after both have been released (oh sh*t).</p>
openQA Tests - action #60176 (Resolved): [kernel][s390x] tests look for login prompt just after t... | https://progress.opensuse.org/issues/60176 | 2019-11-22T12:34:11Z | MDoucha (martin.doucha@suse.com)
<p>Since 2019-11-20 around 09:50, all LTP install jobs running on grenache/s390-kvm-sle12 are timing out while waiting for login prompt.<br>
SLE-12SP2: <a href="https://openqa.suse.de/tests/3610367#step/install_ltp/23" class="external">https://openqa.suse.de/tests/3610367#step/install_ltp/23</a><br>
SLE-12SP4: <a href="https://openqa.suse.de/tests/3615783#step/install_ltp/23" class="external">https://openqa.suse.de/tests/3615783#step/install_ltp/23</a><br>
SLE-12SP5: <a href="https://openqa.suse.de/tests/3615915#step/install_ltp/23" class="external">https://openqa.suse.de/tests/3615915#step/install_ltp/23</a></p>
<p>The login prompt appears on serial console shortly after <code>wait_serial</code> times out: <a href="https://openqa.suse.de/tests/3610367#step/install_ltp/27" class="external">https://openqa.suse.de/tests/3610367#step/install_ltp/27</a></p>
<p>SLE-15GA and SLE-15SP1 jobs run fine, most likely because they use zkvm workers.</p>
openQA Tests - action #58601 (Resolved): [qam]test fails in qa_test_klp (kernel source version mi... | https://progress.opensuse.org/issues/58601 | 2019-10-23T12:53:20Z | MDoucha (martin.doucha@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-Server-DVD-Incidents-Kernel-ppc64le-kernel-live-patching@ppc64le-virtio fails in<br>
<a href="https://openqa.suse.de/tests/3508305/modules/qa_test_klp/steps/12" class="external">qa_test_klp</a></p>
<p>The VM image was installed for kernel build 1760 but the test job was stuck in the queue for too long and a new kernel build became available in the meantime. When the test finally started, the test job installed kernel source for build 1761. The live patch compiler then couldn't find the kernel sources and the job failed.</p>
<p>Solution: Read running kernel version from <code>uname</code> and always install specific version of kernel sources.</p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>qa_test_klp, test of Kernel Livepatching Infrastructure</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3508305" class="external">4.12.14-1760.1.gcb14640</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3504343" class="external">4.12.14-1754.1.g481da9b</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=ppc64le&distri=sle&flavor=Server-DVD-Incidents-Kernel&machine=ppc64le-virtio&test=kernel-live-patching&version=15" class="external">latest</a></p>
openQA Tests - action #57131 (Resolved): install_ltp job fails in update_kernel (12SP4@ppc64le) | https://progress.opensuse.org/issues/57131 | 2019-09-20T09:00:32Z | MDoucha (martin.doucha@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP4-Server-DVD-Incidents-Kernel-ppc64le-install_ltp+sle+Server-DVD-Incidents-Kernel@ppc64le-virtio consistently fails in <a href="https://openqa.suse.de/tests/3384280/modules/update_kernel/steps/32" class="external">update_kernel</a> due to DNS error. Zypper almost always fails to resolve IP address of update repository host. The failure happens at different points in the test job (sometimes in module update_kernel, sometimes in module install_ltp) but it's always a DNS resolution error.</p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>install ltp with maintenance kernel/kgraft update</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3342870" class="external">4.12.14-358.1.g6790685</a><br>
Oldest known failure of this type and build branch: <a href="https://openqa.suse.de/tests/3127191" class="external">4.12.14-322.1.g0619c2b</a><br>
Oldest known failure of this type in other 12SP4@ppc64le branches: <a href="https://openqa.suse.de/tests/3064111" class="external">:11846:kernel-ec2</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3330947" class="external">4.12.14-356.1.gff88a5c</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=ppc64le&distri=sle&flavor=Server-DVD-Incidents-Kernel&machine=ppc64le-virtio&test=install_ltp%2Bsle%2BServer-DVD-Incidents-Kernel&version=12-SP4" class="external">latest</a></p>