openSUSE Project Management Tool: Issues https://progress.opensuse.org/ 2021-10-25T13:29:26Z
openQA Project - action #101457 (New): Native per-module bug tags https://progress.opensuse.org/issues/101457 2021-10-25T13:29:26Z rpalethorpe richard.palethorpe@suse.com
<a name="Motivation"></a>
<h1 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h1>
<p>We need to tag individual modules (e.g. LTP tests) within a job. Presently we (kernel qa) do this within job comments using syntax like "test123: bug#123". This requires parsing job comments.</p>
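<p>To illustrate the comment parsing this requires, here is a minimal Python sketch. It assumes one tag per line in exactly the "test123: bug#123" form quoted above; the regular expression and the function name <code>parse_bug_tags</code> are hypothetical and are not taken from any existing kernel QA tooling.</p>

```python
import re

# Hypothetical pattern for the "<module>: bug#<number>" comment convention.
TAG_RE = re.compile(r"^\s*(?P<module>[\w.-]+)\s*:\s*bug#(?P<bug>\d+)\s*$")


def parse_bug_tags(comment):
    """Return a dict mapping module name to bug number for each tag line."""
    tags = {}
    for line in comment.splitlines():
        match = TAG_RE.match(line)
        if match:
            # Lines that do not look like a tag are silently ignored.
            tags[match.group("module")] = int(match.group("bug"))
    return tags
```

<p>Every consumer of the tags has to agree on such a pattern, which is the fragility a structured data source would remove.</p>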
<p>Other teams have different solutions, like parsing external YAML files and marking individual modules as soft-failed.</p>
<p>Providing a single structured data source in openQA will simplify reporting and bug tag propagation.</p>
<a name="Goal"></a>
<h1 >Goal<a href="#Goal" class="wiki-anchor">¶</a></h1>
<p>Provide a simple interface through openQA to:</p>
<ul>
<li>assign a bug to a job module</li>
<li>query the bug assigned to a job module</li>
<li>remove a bug from a job module</li>
</ul>
<p>I think a single reference to one bug tracker is sufficient. Related items in other trackers can be handled by one external tracker (e.g. Redmine).</p>
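<p>The three operations listed above could look roughly like the following Python sketch. It is an in-memory stand-in only: the class <code>ModuleBugTags</code>, its method names, and the choice of a (job id, module name) key are assumptions made for illustration, not an existing openQA API.</p>

```python
from typing import Dict, Optional, Tuple


class ModuleBugTags:
    """In-memory stand-in for a per-module bug tag store (hypothetical)."""

    def __init__(self) -> None:
        # Key: (job id, module name); value: a single bug reference.
        self._tags: Dict[Tuple[int, str], str] = {}

    def assign(self, job_id: int, module: str, bug_ref: str) -> None:
        """Assign a bug to a job module, replacing any previous reference."""
        self._tags[(job_id, module)] = bug_ref

    def query(self, job_id: int, module: str) -> Optional[str]:
        """Return the bug assigned to a job module, or None if untagged."""
        return self._tags.get((job_id, module))

    def remove(self, job_id: int, module: str) -> bool:
        """Remove the bug from a job module; report whether one was set."""
        return self._tags.pop((job_id, module), None) is not None
```

<p>Storing exactly one reference per module mirrors the suggestion that a single bug tracker reference is sufficient.</p>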
<a name="Non-Goals"></a>
<h1 >Non-Goals<a href="#Non-Goals" class="wiki-anchor">¶</a></h1>
<ul>
<li>Propagate bugs from one build to the next</li>
<li>Notifications or reporting</li>
</ul>
<a name="Alternatives"></a>
<h1 >Alternatives<a href="#Alternatives" class="wiki-anchor">¶</a></h1>
<ul>
<li>External service and database (e.g. <a href="https://gitlab.suse.de/kernel-qa/bugtags" class="external">https://gitlab.suse.de/kernel-qa/bugtags</a>)</li>
</ul>
openQA Tests - action #62339 (Rejected): [kernel][ltp] <syscall> slept too long failures in VMs https://progress.opensuse.org/issues/62339 2020-01-20T08:58:37Z rpalethorpe richard.palethorpe@suse.com
<p>Sometimes timing tests fail with this, especially on ARM. As we are running the tests in VMs on hosts with lots of contention, this is most likely caused by the environment.</p>
openQA Project - action #40538 (Workable): Reset/Clear guest RAM when it reboots in QEMU to reduc... https://progress.opensuse.org/issues/40538 2018-09-03T14:22:40Z rpalethorpe richard.palethorpe@suse.com
<p>During installation 4GB+ of RAM can be used by the guest. Most of the time the RAM usage is much lower than this.</p>
<p>After installation completes the system is rebooted and then a snapshot is taken. In theory the snapshot should be very small because the system has only just booted, however it appears that QEMU thinks all the RAM is still in use and saves it to the snapshot. This might not be unexpected because on bare metal the RAM is preserved between reboots on modern systems. However, assuming that it is not relied upon by the guest OS, we don't need it to happen and can save some time.</p>
<p>Some ideas to solve this:</p>
<ul>
<li>Use the virtio memory balloon</li>
<li>Use the -no-reboot switch and restart the QEMU process if it exits unexpectedly.</li>
<li>Patch QEMU to clear (some of) the RAM when the guest initiates a reboot.</li>
</ul>
openQA Project - action #40520 (New): SKIPTO fails to load snapshots https://progress.opensuse.org/issues/40520 2018-09-03T10:41:57Z rpalethorpe richard.palethorpe@suse.com
<p>There appear to be multiple problems with this feature, in particular when using MAKETESTSNAPSHOTS.</p>
<p>Sometimes loading snapshots works as expected, but at other times it fails with various error messages, some from QEMU directly and others from the QEMU backend.</p>
<p>One error from the backend is:<br>
DIE Sequence mismatch while loading 'shutdown-shutdown' snapshot state: 30 != 28 at /home/geekotest/os-autoinst/OpenQA/Qemu/SnapshotConf.pm line 102.</p>
<p>Another from QEMU is:<br>
[2018-09-03T10:33:04.0775 CEST] [debug] QEMU: qemu-system-aarch64: Unknown savevm section or instance '0000:00:06.0/virtio-scsi' 0<br>
[2018-09-03T10:33:04.0775 CEST] [debug] QEMU: qemu-system-aarch64: load of migration failed: Invalid argument</p>
<p>Restarting the same job multiple times with SKIPTO seems to increase the chances of a failure.</p>
openQA Project - action #19174 (Rejected): [aarch64] Timeouts waiting for QEMU HMP socket during ... https://progress.opensuse.org/issues/19174 2017-05-16T08:05:34Z rpalethorpe richard.palethorpe@suse.com
<p>Sometimes aarch64 tests time out waiting for a response from QEMU over HMP, for example in <a href="https://openqa.suse.de/tests/933880">https://openqa.suse.de/tests/933880</a>.</p>
<pre><code>06:42:01.1674 1294 ||| finished boot_ltp kernel at 2017-05-16 06:42:01 (126 s)
06:42:01.1686 1294 Creating a VM snapshot lastgood
DIE ERROR: timeout reading hmp socket
at /usr/lib/os-autoinst/backend/baseclass.pm line 73.
backend::baseclass::die_handler('ERROR: timeout reading hmp socket\x{a}') called at /usr/lib/os-autoinst/backend/qemu.pm line 923
backend::qemu::_read_hmp('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/qemu.pm line 991
backend::qemu::_send_hmp('backend::qemu=HASH(0xd22b550)', 'savevm lastgood') called at /usr/lib/os-autoinst/backend/qemu.pm line 212
backend::qemu::save_snapshot('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9baf48)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 68
backend::baseclass::handle_command('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9c08b8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 422
backend::baseclass::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)') called at /usr/lib/os-autoinst/backend/qemu.pm line 1018
backend::qemu::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)', 0) called at /usr/lib/os-autoinst/backend/baseclass.pm line 203
eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 151
backend::baseclass::run_capture_loop('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 122
backend::baseclass::run('backend::qemu=HASH(0xd22b550)', 6, 9) called at /usr/lib/os-autoinst/backend/driver.pm line 85
backend::driver::start('backend::driver=HASH(0xc535e90)') called at /usr/lib/os-autoinst/backend/driver.pm line 48
backend::driver::new('backend::driver', 'qemu') called at /usr/bin/isotovideo line 206
main::init_backend() called at /usr/bin/isotovideo line 271
06:47:01.2664 1296 waitpid for 1302 returned 0
06:47:01.2665 1296 sending TERM to qemu pid: 1302
06:47:02.2668 1296 waitpid for 1302 returned 0
06:47:02.5449 1288 signalhandler got TERM - loop 1
06:47:02.5451 1288 awaiting death of commands process
06:47:02.5505 1288 commands process exited: 1292
06:47:02.5507 1288 awaiting death of testpid 1294
06:47:02.5588 1288 test process exited: 1294
06:47:02.5589 1288 isotovideo failed
</code></pre>
openQA Project - action #16506 (Resolved): [easy hack] Use of uninitialized value with isotovideo... https://progress.opensuse.org/issues/16506 2017-02-06T14:45:11Z rpalethorpe richard.palethorpe@suse.com
<p>Run <code>isotovideo --help</code> and observe:</p>
<p><code>Use of uninitialized value $r in concatenation (.) or string at qa/os-autoinst/isotovideo line 537.<br>
23685: EXIT<br>
Use of uninitialized value $? in scalar assignment at qa/os-autoinst/isotovideo line 538.</code></p>
openQA Tests - action #15678 (Resolved): [LTP][OpenQA] misc: acpi_test_dev_callback fails https://progress.opensuse.org/issues/15678 2016-12-29T09:16:53Z rpalethorpe richard.palethorpe@suse.com
<p>The ltp_acpi test fails when running a test called acpi_test_dev_callback inside the ltp_acpi_cmds kernel module.</p>
<p><a href="https://openqa.suse.de/tests/686455#step/run_ltp/45" class="external">https://openqa.suse.de/tests/686455#step/run_ltp/45</a></p>
openQA Tests - action #15668 (Resolved): [kernel][LTP][OpenQA] hyperthreading ht_interrupt won't run https://progress.opensuse.org/issues/15668 2016-12-28T15:53:28Z rpalethorpe richard.palethorpe@suse.com
<p>The test fails with TCONF, claiming that the system does not have HT enabled. However, the CPU has the HT flag and shows 8 processors configured on a 4-core CPU.</p>
<p>The file <code>/testcases/kernel/sched/hyperthreading/ht_interrupt/ht_utils.c</code> (AFAICT) checks the <code>/proc/cpuinfo</code> file for the number of logical processors vs the number of sockets/cores/threads and also looks for a line starting with <code>cpu_package</code>. If it finds <code>cpu_package</code> then it assumes that this is a Hyperthreading kernel. No such line is present on my system.</p>
<p>The other tests in the HT runfile, smt_smp_enable and smt_smp_affinity, run and pass; they have their own copies of <code>ht_utils.c</code>, which are different.</p>
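<p>For comparison, the two kinds of check described above can be sketched in Python as follows. This is not the LTP code: the function name <code>ht_indicators</code> is hypothetical, and the sketch assumes the x86 <code>/proc/cpuinfo</code> fields <code>siblings</code> and <code>cpu cores</code> are present.</p>

```python
def ht_indicators(cpuinfo_text):
    """Report two independent hyperthreading hints found in cpuinfo text."""
    fields = {}
    has_cpu_package = False
    for line in cpuinfo_text.splitlines():
        if line.startswith("cpu_package"):
            has_cpu_package = True
        key, sep, value = line.partition(":")
        if sep:
            # Fields repeat once per logical CPU; keep the first occurrence.
            fields.setdefault(key.strip(), value.strip())
    siblings = int(fields.get("siblings", 0))
    cores = int(fields.get("cpu cores", 0))
    return {
        "has_cpu_package_line": has_cpu_package,
        # More logical CPUs than cores per package suggests SMT is active.
        "siblings_exceed_cores": cores > 0 and siblings > cores,
    }
```

<p>On a system like the one described, the second hint would be true while the first is false, which matches the TCONF result if the test relies only on the <code>cpu_package</code> line.</p>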
openQA Tests - action #15652 (Closed): [LTP][OpenQA] commands: mkfs.ntfs missing https://progress.opensuse.org/issues/15652 2016-12-27T11:26:45Z rpalethorpe richard.palethorpe@suse.com
<p>The ntfsprogs package appears to be missing in SLES 12 onwards. It exists in openSUSE, so we could at least run the tests there, but then we start on the path of installing different sets of packages in SLES and openSUSE. Furthermore, I am not sure that we care about NTFS support.</p>
openQA Tests - action #15626 (Resolved): [LTP][OpenQA] commands: du -a test fails https://progress.opensuse.org/issues/15626 2016-12-22T12:46:21Z rpalethorpe richard.palethorpe@suse.com
<p><a href="https://openqa.suse.de/tests/686450#step/run_ltp/111" class="external">https://openqa.suse.de/tests/686450#step/run_ltp/111</a></p>
openQA Tests - action #15624 (Resolved): [LTP][OpenQA] sssd: sss_* commands not found https://progress.opensuse.org/issues/15624 2016-12-22T12:42:05Z rpalethorpe richard.palethorpe@suse.com
<p>It seems that SUSE does not have commands such as sss_useradd (they are not in sssd-tools). We can probably just use regular <code>useradd</code> with the correct PAM modules configured.</p>
openQA Tests - action #15622 (Resolved): [LTP][OpenQA] commands: mail test claims mail fail is no... https://progress.opensuse.org/issues/15622 2016-12-22T12:36:35Z rpalethorpe richard.palethorpe@suse.com
<p><a href="https://openqa.suse.de/tests/686450#step/run_ltp/56" class="external">https://openqa.suse.de/tests/686450#step/run_ltp/56</a></p>
<p><a href="https://openqa.suse.de/tests/latest?flavor=Server-DVD&distri=sle&machine=64bit&test=ltp_commands&version=12-SP3&arch=x86_64" class="external">latest</a></p>
openQA Tests - action #15492 (Resolved): Upgrade ppc64le workers to QEMU 2.6.*, i.e. current Leap... https://progress.opensuse.org/issues/15492 2016-12-14T11:06:38Z rpalethorpe richard.palethorpe@suse.com
<a name="observation"></a>
<h2 >observation<a href="#observation" class="wiki-anchor">¶</a></h2>
<p>This job (<a href="https://openqa.suse.de/tests/668033/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/668033/file/autoinst-log.txt</a>) fails because the logfile parameter is not available in the installed version of QEMU on the worker.</p>
<a name="problem"></a>
<h2 >problem<a href="#problem" class="wiki-anchor">¶</a></h2>
<p>Upgrading the QEMU version on the worker will fix this. For this, however, we would need to update e.g. malbec.arch from SLES 12 SP1 to a more recent version, which no one has done, maybe for good reasons.</p>
<a name="workaround"></a>
<h2 >workaround<a href="#workaround" class="wiki-anchor">¶</a></h2>
<p>The virtio-console is optional in os-autoinst and is only enabled if the job states 'VIRTIO_CONSOLE=1'. As a workaround disable this setting.</p>
openQA Project - action #14690 (Resolved): Live stream for serial terminal https://progress.opensuse.org/issues/14690 2016-11-08T14:32:21Z rpalethorpe richard.palethorpe@suse.com
<p>Replace the live SUT video feed in the openQA UI with a scrolling text display when a serial terminal is set as the active console.</p>
<p>Currently, when the user selects a serial console, a stale screenshot of the last used VNC console is shown. The live log below still updates, but the user experience is significantly degraded.</p>
openQA Project - coordination #14626 (New): [epic] backend and console capabilities interface to ... https://progress.opensuse.org/issues/14626 2016-11-03T13:28:48Z rpalethorpe richard.palethorpe@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Prevent "if/else" in tests needing to distinguish different backends</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Obvious "if/else" branches for different types of consoles are no longer necessary in os-autoinst-distri-opensuse</li>
<li><strong>AC2:</strong> Same as <em>AC1</em> for different <em>backends</em></li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read what has been done in <a href="https://github.com/os-autoinst/os-autoinst/pull/1232">https://github.com/os-autoinst/os-autoinst/pull/1232</a> and <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8718">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8718</a> to define "persistent" consoles</li>
<li>Incorporate content from <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/Utils/Backends.pm">https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/Utils/Backends.pm</a> into os-autoinst as flags on backends rather than if/else in test code</li>
<li>Look for other "if/else" code in test distributions, e.g. os-autoinst-distri-opensuse, that distinguishes different backends and consoles, and provide these distinctions as capabilities on backends/consoles</li>
</ul>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<a name="Background"></a>
<h3 >Background<a href="#Background" class="wiki-anchor">¶</a></h3>
<p>In an ideal world all the backends (QEMU, bare metal, Xen) and consoles (VNC, serial or hybrid) would be accessed in a uniform manner by testapi, so that distribution and test writers could write a test case once and have it run across all available platforms without modification. In practice, however, combinations of operating system, hardware, hypervisor and console differ enough in behaviour that a completely uniform API is not possible without either significantly disadvantaging some platforms or providing support for edge cases in the distribution itself.</p>
<p>While the functions in testapi can be kept mostly uniform in availability and behaviour, this requires that the distribution handles changes in the operating system's behaviour caused by the machine (virtual or physical) it is running on and by the user interface (console) that is selected. Many things can be abstracted away into the console or backend classes in os-autoinst; OS-specific behaviour, however, cannot be without making os-autoinst specific to one type of OS or even one Linux distribution. Currently the SUSE os-autoinst distribution handles differences between backends by reading variables in a variety of places to determine which backend or architecture is being used, and then performing some particular action for that backend. Unfortunately this doesn't just happen in (suse)distribution.pm or other modules in the lib folder, but throughout the test cases.</p>
<p>The problem with branching on a particular architecture or backend is that the contents of the branch may actually apply to a whole class of backends, not just one. By restricting it to one particular backend you miss an opportunity to maximise the benefit of your code, which leads to duplication of effort. In some cases, however, it may be wasted effort to invent general abstractions that will only be used in one or two instances. That is a universal problem; we just have to make a judgement in each case.</p>
<a name="Proposal"></a>
<h3 >Proposal<a href="#Proposal" class="wiki-anchor">¶</a></h3>
<p>My proposal is to introduce the notion of capabilities, which can apply to consoles or backends. Every console or backend should have to declare its capabilities in a standard way, which can then be read by the distribution and, in some very rare cases, by the distribution's test modules. Capabilities should be validated against a central list in the appropriate base class; attempting to access or set a capability which does not exist should be an error. This should make them better structured than simply adding more global variables, which already serve this purpose to some extent. A list of quirks could also be maintained to indicate negative platform attributes.</p>
<p>The actual implementation could be done using a Perl hash, object mixins from some Perl OO library or something else.</p>
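<p>As a rough illustration of the validation rule, here is a minimal sketch in Python rather than Perl. The class <code>CapabilityHolder</code>, its methods and the exception <code>UnknownCapability</code> are hypothetical; the capability names are borrowed from the serial terminal discussion in this ticket and do not form an agreed list.</p>

```python
class UnknownCapability(KeyError):
    """Raised when a capability name is not in the central list."""


class CapabilityHolder:
    """Hypothetical base class for a console or backend."""

    # Central list of valid capability names (illustrative only).
    KNOWN_CAPABILITIES = frozenset(
        {"direct_read_text", "direct_write_text", "redirect_write_text"}
    )

    def __init__(self, *capabilities):
        self._capabilities = set()
        for name in capabilities:
            self.declare(name)

    @classmethod
    def _validate(cls, name):
        if name not in cls.KNOWN_CAPABILITIES:
            raise UnknownCapability(name)

    def declare(self, name):
        """Declare a capability; unknown names are an error."""
        self._validate(name)
        self._capabilities.add(name)

    def has(self, name):
        """Query a capability; unknown names are also an error."""
        self._validate(name)
        return name in self._capabilities
```

<p>Raising on unknown names in both <code>declare</code> and <code>has</code> is what separates this from free-form global variables, where a misspelt name silently reads as unset.</p>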
<a name="Further-rambling"></a>
<h3 >Further rambling<a href="#Further-rambling" class="wiki-anchor">¶</a></h3>
<p>In the case of the serial terminal feature, I would remove the new testapi function I have added, <code>is_serial_terminal</code>, and replace it with one or more console/backend capabilities. Perhaps something like <code>direct_read_text</code> and <code>direct_write_text</code>, which indicate to the distribution that we are reading and writing raw text to the terminal; the backend would of course require a serial port capability to activate the console. The Linux VNC text console would have something like <code>redirect_write_text</code>, which indicates that we can redirect output to the serial port, plus some other capabilities to indicate that we can use needles and send key presses. Any console on some other OS/hardware combination which doesn't support serial ports will be missing the capabilities which indicate we can do this, so either <code>run_script</code> and <code>wait_serial</code> will return an error indicating the missing capability, or the distribution will have to implement the testapi functions using some other capabilities or workarounds.</p>
<p>Along with having capabilities comes the idea of having interfaces to take advantage of them, so that a backend, console and distribution with compatible capabilities can be plugged together. Such interfaces and their associated capabilities can be invented and implemented on a rolling basis rather than attempting some massive overhaul of the code base. This may slow down feature development for some time, but will eventually speed it up and creates a basis for a backend/console plugin architecture. I am willing to implement this insofar as it is required for <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: Add virtio serial console backend and API (Resolved)" href="https://progress.opensuse.org/issues/14582">#14582</a> and for other platform-specific features that I think the LTP, and other test suites I may work with, can take advantage of.</p>
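<p>From the distribution's side, choosing how to run a script based on console capabilities might look like the following Python sketch. The function <code>choose_script_strategy</code>, the strategy labels it returns and the exception <code>MissingCapability</code> are all hypothetical; only the three capability names come from the text above.</p>

```python
class MissingCapability(RuntimeError):
    """Raised when no declared capability can satisfy a request."""


def choose_script_strategy(capabilities):
    """Pick a way to run a script from a console's declared capabilities."""
    caps = set(capabilities)
    if {"direct_read_text", "direct_write_text"} <= caps:
        # Raw text in both directions: talk to the serial terminal directly.
        return "serial_terminal"
    if "redirect_write_text" in caps:
        # Type the command on screen and redirect its output to the serial port.
        return "redirect_to_serial"
    raise MissingCapability(
        "console declares no capability suitable for running scripts"
    )
```

<p>The branch is on what the console can do rather than on which console it is, so a new console with the same capabilities needs no change in the distribution.</p>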
<a name="Alternatives"></a>
<h3 >Alternatives<a href="#Alternatives" class="wiki-anchor">¶</a></h3>
<p>The alternatives are to forgo taking advantage of platform-specific features, or to add the occasional function to the testapi like <code>is_serial_terminal</code> and use the existing vars mechanism. At least for what I am currently doing, the latter choice is acceptable to me, but it is not extensible beyond a point. There is also the console proxy feature, which allows you to tightly couple your test module to a particular console implementation, completely bypassing all layers of abstraction while at the same time obfuscating the code flow using Perl meta programming, which should be avoided, at least from the tests' perspective.</p>
<a name="Problems"></a>
<h3 >Problems<a href="#Problems" class="wiki-anchor">¶</a></h3>
<p>It is more difficult to identify and isolate a class of behaviour shared by multiple entities, and to create an abstraction to encapsulate it, than just to write code for a specific case. Sometimes people may attempt to create capabilities when there is no significant advantage to doing so, or they may be tempted not to when there clearly is an advantage. The feature will need documenting and will require effort on the part of reviewers to learn it and enforce its use. It will probably increase the codebase's complexity initially, until it has been reasonably taken advantage of. There is the danger of an explosion in capabilities, which would make it difficult to write a new distribution that covers multiple platforms without understanding a large number of them. Regressions may be introduced while moving backends and consoles over to this system.</p>