openSUSE Project Management Tool: Issues https://progress.opensuse.org/ 2021-10-25T13:29:26Z
openQA Project - action #101457 (New): Native per-module bug tags https://progress.opensuse.org/issues/101457 2021-10-25T13:29:26Z rpalethorpe richard.palethorpe@suse.com
<a name="Motivation"></a>
<h1 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h1>
<p>We need to tag individual modules (e.g. LTP tests) within a job. At present we (Kernel QA) do this in job comments using syntax like "test123: bug#123", which requires parsing the job comments.</p>
<p>Other teams have different solutions, like parsing external YAML files and marking individual modules as soft-failed.</p>
<p>Providing a single structured data source in openQA will simplify reporting and bug tag propagation.</p>
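<p>To illustrate the parsing burden described above, here is a minimal sketch of a comment parser. The only syntax given in this ticket is the example "test123: bug#123"; the regular expression, the function name <code>parse_bug_tags</code> and the idea of one tag per line are assumptions made for illustration, not the Kernel QA implementation.</p>

```python
import re

# Assumed syntax, generalised from the single example "test123: bug#123":
# a module name, a colon, a lowercase tracker prefix, "#", and a numeric ID.
TAG_RE = re.compile(
    r"^\s*(?P<module>[\w.-]+)\s*:\s*(?P<tracker>[a-z]+)#(?P<id>\d+)\s*$"
)


def parse_bug_tags(comment):
    """Return {module_name: "tracker#id"} for every tag line in a comment.

    Lines that do not match the assumed syntax are ignored, which is
    exactly the fragility a structured data source would remove.
    """
    tags = {}
    for line in comment.splitlines():
        match = TAG_RE.match(line)
        if match:
            tags[match["module"]] = f"{match['tracker']}#{match['id']}"
    return tags
```

<p>Any free-form text in the same comment has to be skipped by the pattern, so a typo in a tag line silently drops the tag.</p>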
<a name="Goal"></a>
<h1 >Goal<a href="#Goal" class="wiki-anchor">¶</a></h1>
<p>Provide a simple interface through openQA to:</p>
<ul>
<li>assign a bug to a job module</li>
<li>query the bug assigned to a job module</li>
<li>remove a bug from a job module</li>
</ul>
<p>I think a single reference to one bug tracker is sufficient. Related items in other trackers can be handled by one external tracker (e.g. Redmine).</p>
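<p>The three operations listed under this goal, together with the single-reference rule, could be modelled roughly as follows. This is an in-memory sketch only: the class name <code>ModuleBugTags</code>, its method names and the <code>(job_id, module)</code> key are hypothetical and do not correspond to any existing openQA API or schema.</p>

```python
from typing import Dict, Optional, Tuple


class ModuleBugTags:
    """Hypothetical store holding at most one bug reference per job module."""

    def __init__(self) -> None:
        # Key: (job ID, module name). Value: a single bug reference string.
        self._tags: Dict[Tuple[int, str], str] = {}

    def assign(self, job_id: int, module: str, bug_ref: str) -> None:
        # Assigning again replaces the previous reference (single-bug rule).
        self._tags[(job_id, module)] = bug_ref

    def query(self, job_id: int, module: str) -> Optional[str]:
        # Returns None when no bug is assigned to the module.
        return self._tags.get((job_id, module))

    def remove(self, job_id: int, module: str) -> bool:
        # Returns True if a tag existed and was removed.
        return self._tags.pop((job_id, module), None) is not None
```

<p>Keeping exactly one reference per module keeps the query result unambiguous; related items in other trackers would be linked from that one bug, as suggested above.</p>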
<a name="Non-Goals"></a>
<h1 >Non-Goals<a href="#Non-Goals" class="wiki-anchor">¶</a></h1>
<ul>
<li>Propagate bugs from one build to the next</li>
<li>Notifications or reporting</li>
</ul>
<a name="Alternatives"></a>
<h1 >Alternatives<a href="#Alternatives" class="wiki-anchor">¶</a></h1>
<ul>
<li>External service and database (e.g. <a href="https://gitlab.suse.de/kernel-qa/bugtags" class="external">https://gitlab.suse.de/kernel-qa/bugtags</a>)</li>
</ul>
openQA Project - action #40538 (Workable): Reset/Clear guest RAM when it reboots in QEMU to reduc... https://progress.opensuse.org/issues/40538 2018-09-03T14:22:40Z rpalethorpe richard.palethorpe@suse.com
<p>During installation 4GB+ of RAM can be used by the guest. Most of the time the RAM usage is much lower than this.</p>
<p>After installation completes, the system is rebooted and a snapshot is taken. In theory the snapshot should be very small because the system has only just booted. However, QEMU appears to treat all of the RAM as still in use and saves it to the snapshot. This is not entirely unexpected, because on modern bare-metal systems RAM contents are preserved across reboots. Assuming the guest OS does not rely on this, we do not need it and could save some time by avoiding it.</p>
<p>Some ideas to solve this:</p>
<ul>
<li>Use the virtio memory balloon</li>
<li>Use the -no-reboot switch and restart the QEMU process if it exits unexpectedly.</li>
<li>Patch QEMU to clear (some of) the RAM when the guest initiates a reboot.</li>
</ul>
openQA Project - action #40520 (New): SKIPTO fails to load snapshots https://progress.opensuse.org/issues/40520 2018-09-03T10:41:57Z rpalethorpe richard.palethorpe@suse.com
<p>There appear to be multiple problems with this feature, in particular when using MAKETESTSNAPSHOTS.</p>
<p>Sometimes loading snapshots works as expected, but at other times it fails with various error messages. Some of them come from QEMU directly and others from the QEMU backend.</p>
<p>One error from the backend is:<br>
DIE Sequence mismatch while loading 'shutdown-shutdown' snapshot state: 30 != 28 at /home/geekotest/os-autoinst/OpenQA/Qemu/SnapshotConf.pm line 102.</p>
<p>Another from QEMU is:<br>
[2018-09-03T10:33:04.0775 CEST] [debug] QEMU: qemu-system-aarch64: Unknown savevm section or instance '0000:00:06.0/virtio-scsi' 0<br>
[2018-09-03T10:33:04.0775 CEST] [debug] QEMU: qemu-system-aarch64: load of migration failed: Invalid argument</p>
<p>Restarting the same job multiple times with SKIPTO seems to increase the chances of a failure.</p>
openQA Project - action #19174 (Rejected): [aarch64] Timeouts waiting for QEMU HMP socket during ... https://progress.opensuse.org/issues/19174 2017-05-16T08:05:34Z rpalethorpe richard.palethorpe@suse.com
<p>Sometimes aarch64 tests time out waiting for a response from QEMU over HMP, in particular <a href="https://openqa.suse.de/tests/933880">https://openqa.suse.de/tests/933880</a>.</p>
<pre><code>06:42:01.1674 1294 ||| finished boot_ltp kernel at 2017-05-16 06:42:01 (126 s)
06:42:01.1686 1294 Creating a VM snapshot lastgood
DIE ERROR: timeout reading hmp socket
at /usr/lib/os-autoinst/backend/baseclass.pm line 73.
backend::baseclass::die_handler('ERROR: timeout reading hmp socket\x{a}') called at /usr/lib/os-autoinst/backend/qemu.pm line 923
backend::qemu::_read_hmp('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/qemu.pm line 991
backend::qemu::_send_hmp('backend::qemu=HASH(0xd22b550)', 'savevm lastgood') called at /usr/lib/os-autoinst/backend/qemu.pm line 212
backend::qemu::save_snapshot('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9baf48)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 68
backend::baseclass::handle_command('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9c08b8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 422
backend::baseclass::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)') called at /usr/lib/os-autoinst/backend/qemu.pm line 1018
backend::qemu::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)', 0) called at /usr/lib/os-autoinst/backend/baseclass.pm line 203
eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 151
backend::baseclass::run_capture_loop('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 122
backend::baseclass::run('backend::qemu=HASH(0xd22b550)', 6, 9) called at /usr/lib/os-autoinst/backend/driver.pm line 85
backend::driver::start('backend::driver=HASH(0xc535e90)') called at /usr/lib/os-autoinst/backend/driver.pm line 48
backend::driver::new('backend::driver', 'qemu') called at /usr/bin/isotovideo line 206
main::init_backend() called at /usr/bin/isotovideo line 271
06:47:01.2664 1296 waitpid for 1302 returned 0
06:47:01.2665 1296 sending TERM to qemu pid: 1302
06:47:02.2668 1296 waitpid for 1302 returned 0
06:47:02.5449 1288 signalhandler got TERM - loop 1
06:47:02.5451 1288 awaiting death of commands process
06:47:02.5505 1288 commands process exited: 1292
06:47:02.5507 1288 awaiting death of testpid 1294
06:47:02.5588 1288 test process exited: 1294
06:47:02.5589 1288 isotovideo failed
</code></pre>
openQA Project - action #16616 (Rejected): ppc64le tests die/timeout while saving snapshot https://progress.opensuse.org/issues/16616 2017-02-09T11:01:12Z rpalethorpe richard.palethorpe@suse.com
<p>In the following case it clearly shows that the test timed out while waiting for a response from QEMU. In other cases it is not clear to me why the test dies, but it seems to happen at the same point (where a snapshot is saved). I thought there would be an existing ticket for this, but could not find it.</p>
<p><a href="https://openqa.suse.de/tests/762741" class="external">https://openqa.suse.de/tests/762741</a></p>
<a name="Hypothesises"></a>
<h2 >Hypotheses<a href="#Hypothesises" class="wiki-anchor">¶</a></h2>
<ul>
<li>H1, It takes too long to save the snapshot and times out, but would complete if given enough time.</li>
<li>H2, QEMU crashes</li>
<li>H3, The storage is unreachable or broken</li>
<li>H4, The socket is misread by os-autoinst</li>
</ul>
<p>H1 seems the most likely by far.</p>
<a name="Potential-Actions"></a>
<h2 >Potential Actions<a href="#Potential-Actions" class="wiki-anchor">¶</a></h2>
<ul>
<li>A1, Increase the timeout</li>
<li>A2, Increase the storage or compression performance</li>
<li>A3, Stress test openQA to recreate the bug and investigate further</li>
</ul>
<p>A1 is the easiest. A2 and A3 may be more profitable, but may be too difficult for now.</p>
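<p>Hypothesis H1 and action A1 can be illustrated with a small, self-contained sketch: a reply that arrives late makes a short read timeout fail, while a longer timeout on the same socket succeeds. This is not os-autoinst code (which is Perl); the names <code>read_reply</code> and <code>slow_peer</code>, the delays and the reply text are all assumptions chosen for the illustration.</p>

```python
import select
import socket
import threading
import time


def read_reply(sock, timeout_s):
    """Wait up to timeout_s seconds for data; raise TimeoutError otherwise."""
    ready, _, _ = select.select([sock], [], [], timeout_s)
    if not ready:
        raise TimeoutError("timeout reading socket")
    return sock.recv(4096)


def slow_peer(sock, delay_s):
    """Stand-in for a monitor that answers only after a long operation."""
    time.sleep(delay_s)
    sock.sendall(b"done\n")


if __name__ == "__main__":
    ours, theirs = socket.socketpair()
    worker = threading.Thread(target=slow_peer, args=(theirs, 0.3))
    worker.start()
    try:
        read_reply(ours, 0.05)  # too short: the peer is still busy
    except TimeoutError as err:
        print("short timeout:", err)
    print("long timeout:", read_reply(ours, 5.0))  # same request completes
    worker.join()
    ours.close()
    theirs.close()
```

<p>If H1 holds, raising the timeout only hides the slow save; A2 would be the action that addresses the cause.</p>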
<a name="Workarounds"></a>
<h2 >Workarounds<a href="#Workarounds" class="wiki-anchor">¶</a></h2>
<p>Simply restart the test manually.</p>
openQA Project - action #16506 (Resolved): [easy hack] Use of uninitialized value with isotovideo... https://progress.opensuse.org/issues/16506 2017-02-06T14:45:11Z rpalethorpe richard.palethorpe@suse.com
<p>Run <code>isotovideo --help</code> and observe:</p>
<p><code>Use of uninitialized value $r in concatenation (.) or string at qa/os-autoinst/isotovideo line 537.<br>
23685: EXIT<br>
Use of uninitialized value $? in scalar assignment at qa/os-autoinst/isotovideo line 538.</code></p>
openQA Tests - action #15700 (Rejected): [LTP][OpenQA] ima,tpm: need TPM https://progress.opensuse.org/issues/15700 2016-12-30T10:26:11Z rpalethorpe richard.palethorpe@suse.com
<p>These tests require a Trusted Platform Module which is not currently available inside our SUT's VM. At a cursory glance, there are a few options for solving this, including, but probably not limited to:</p>
<ol>
<li>Pass the host's TPM through to the guest.</li>
<li>Emulate the TPM using <a href="https://github.com/PeterHuewe/tpm-emulator" class="external">https://github.com/PeterHuewe/tpm-emulator</a> either on the guest or host.</li>
<li>Wait for QEMU TPM device emulation.</li>
</ol>
<p>In the case of option 1 we need to use the Linux vTPM proxy driver to ensure the guest doesn't take exclusive control of the TPM. This requires reconfiguring the host/worker's kernel to build the vtpmx module.</p>
<p>The second option seems quite flexible, although we will need to package the emulator to run it on workers. An emulator built into QEMU also appears to be in the works, and it would be the easiest to configure.</p>
openQA Tests - action #15678 (Resolved): [LTP][OpenQA] misc: acpi_test_dev_callback fails https://progress.opensuse.org/issues/15678 2016-12-29T09:16:53Z rpalethorpe richard.palethorpe@suse.com
<p>The ltp_acpi test fails when running a test called acpi_test_dev_callback inside the ltp_acpi_cmds kernel module.</p>
<p><a href="https://openqa.suse.de/tests/686455#step/run_ltp/45" class="external">https://openqa.suse.de/tests/686455#step/run_ltp/45</a></p>
openQA Tests - action #15668 (Resolved): [kernel][LTP][OpenQA] hyperthreading ht_interrupt won't run https://progress.opensuse.org/issues/15668 2016-12-28T15:53:28Z rpalethorpe richard.palethorpe@suse.com
<p>The test fails with TCONF, claiming the system does not have HT enabled. However, the CPU has the HT flag and shows 8 processors configured on a 4-core CPU.</p>
<p>The file <code>/testcases/kernel/sched/hyperthreading/ht_interrupt/ht_utils.c</code> (AFAICT) checks the <code>/proc/cpuinfo</code> file for the number of logical processors versus the number of sockets/cores/threads and also looks for a line starting with <code>cpu_package</code>. If it finds <code>cpu_package</code>, it assumes that this is a Hyperthreading kernel. No such line is present on my system.</p>
<p>The other tests in the HT runfile, smt_smp_enable and smt_smp_affinity, run and pass. They have their own copies of <code>ht_utils.c</code>, which are different.</p>
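<p>For comparison, a check that does not depend on a <code>cpu_package</code> line can compare the <code>siblings</code> and <code>cpu cores</code> fields that x86 Linux kernels print in <code>/proc/cpuinfo</code>. The sketch below is an alternative heuristic, not what <code>ht_utils.c</code> does; the function name <code>has_smt</code> is hypothetical, and the two fields are not present on every architecture.</p>

```python
def has_smt(cpuinfo_text):
    """Return True if any processor entry reports more siblings than cores.

    Expects the text of /proc/cpuinfo as printed on x86 Linux, where each
    processor block contains "siblings" (logical CPUs per package) and
    "cpu cores" (physical cores per package).
    """
    siblings = cores = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # A blank line separates processor blocks; start a new one.
            siblings = cores = None
            continue
        key = key.strip()
        if key == "siblings":
            siblings = int(value)
        elif key == "cpu cores":
            cores = int(value)
        if siblings is not None and cores is not None and siblings > cores:
            return True
    return False
```

<p>On the system described in this ticket (8 logical processors on 4 cores) such a check would report SMT as enabled, for example when called as <code>has_smt(open("/proc/cpuinfo").read())</code>.</p>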
openQA Tests - action #15652 (Closed): [LTP][OpenQA] commands: mkfs.ntfs missing https://progress.opensuse.org/issues/15652 2016-12-27T11:26:45Z rpalethorpe richard.palethorpe@suse.com
<p>The ntfsprogs package appears to be missing in SLES 12 onwards. It exists in openSUSE, so we could at least run the tests there, but then we start down the path of installing different sets of packages in SLES and openSUSE. Furthermore, I am not sure that we care about NTFS support.</p>
openQA Tests - action #15626 (Resolved): [LTP][OpenQA] commands: du -a test fails https://progress.opensuse.org/issues/15626 2016-12-22T12:46:21Z rpalethorpe richard.palethorpe@suse.com
<p><a href="https://openqa.suse.de/tests/686450#step/run_ltp/111" class="external">https://openqa.suse.de/tests/686450#step/run_ltp/111</a></p>
openQA Tests - action #15624 (Resolved): [LTP][OpenQA] sssd: sss_* commands not found https://progress.opensuse.org/issues/15624 2016-12-22T12:42:05Z rpalethorpe richard.palethorpe@suse.com
<p>It seems that SUSE does not ship commands such as sss_useradd (they are not in sssd-tools). We can probably just use regular <code>useradd</code> with the correct PAM modules configured.</p>
openQA Tests - action #15622 (Resolved): [LTP][OpenQA] commands: mail test claims mail fail is no... https://progress.opensuse.org/issues/15622 2016-12-22T12:36:35Z rpalethorpe richard.palethorpe@suse.com
<p><a href="https://openqa.suse.de/tests/686450#step/run_ltp/56" class="external">https://openqa.suse.de/tests/686450#step/run_ltp/56</a></p>
<p><a href="https://openqa.suse.de/tests/latest?flavor=Server-DVD&distri=sle&machine=64bit&test=ltp_commands&version=12-SP3&arch=x86_64" class="external">latest</a></p>
openQA Tests - action #15492 (Resolved): Upgrade ppc64le workers to QEMU 2.6.*, i.e. current Leap... https://progress.opensuse.org/issues/15492 2016-12-14T11:06:38Z rpalethorpe richard.palethorpe@suse.com
<a name="observation"></a>
<h2 >observation<a href="#observation" class="wiki-anchor">¶</a></h2>
<p>This job (<a href="https://openqa.suse.de/tests/668033/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/668033/file/autoinst-log.txt</a>) fails because the logfile parameter is not available in the installed version of QEMU on the worker.</p>
<a name="problem"></a>
<h2 >problem<a href="#problem" class="wiki-anchor">¶</a></h2>
<p>Upgrading the QEMU version on the worker will fix this. However, that would require updating e.g. malbec.arch from SLES 12 SP1 to a more recent version, which no one has done so far, maybe for good reasons.</p>
<a name="workaround"></a>
<h2 >workaround<a href="#workaround" class="wiki-anchor">¶</a></h2>
<p>The virtio-console is optional in os-autoinst and is only enabled if the job states 'VIRTIO_CONSOLE=1'. As a workaround disable this setting.</p>
openQA Project - action #14690 (Resolved): Live stream for serial terminal https://progress.opensuse.org/issues/14690 2016-11-08T14:32:21Z rpalethorpe richard.palethorpe@suse.com
<p>Replace the live SUT video feed in the openQA UI with a scrolling text display when a serial terminal is set as the active console.</p>
<p>Currently, when the user selects a serial console, a stale screenshot of the last used VNC console is shown. The live log below still updates, but the user experience is significantly degraded.</p>