openSUSE Project Management Tool: Issueshttps://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842020-01-20T08:58:37ZopenSUSE Project Management Tool
Redmine openQA Tests - action #62339 (Rejected): [kernel][ltp] <syscall> slept too long failures in VMshttps://progress.opensuse.org/issues/623392020-01-20T08:58:37Zrpalethorperichard.palethorpe@suse.com
<p>Timing tests sometimes fail with this error, especially on ARM. Because we run the tests in VMs on heavily contended hosts, the environment is the most likely cause.</p>
openQA Project - action #55751 (Resolved): Formatting for <br> and <code> tags in job description...https://progress.opensuse.org/issues/557512019-08-20T08:14:55Zrpalethorperichard.palethorpe@suse.com
<p>Previously we could write <code>&lt;br&gt;</code> to force a line break in comments. We could also use <code>&lt;code&gt;</code> tags.</p>
<p>It seems these are now ignored or filtered. See:<br>
<a href="https://openqa.suse.de/group_overview/155" class="external">https://openqa.suse.de/group_overview/155</a></p>
<p>and <a href="https://openqa.suse.de/tests/3262174#comment-195942" class="external">https://openqa.suse.de/tests/3262174#comment-195942</a></p>
<p><em>Hint:</em> look at the raw text.</p>
<p>For job group descriptions we can switch to Markdown-style code sections if that works. However, we still need the tags for comments, because comments are submitted to the openQA CLI as a single line of text. Of course, someone could instead fix the CLI and the newline handling in comments.</p>
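<p>To illustrate why the tags matter for comments, here is a minimal sketch of how a multi-line comment could be flattened into one line before it is passed to the CLI. The helper name <code>to_single_line</code> is hypothetical and not part of openQA or its client.</p>

```python
def to_single_line(comment: str) -> str:
    """Join the lines of a comment with <br> so it fits on a single line."""
    # splitlines() copes with \n, \r\n and \r; trailing whitespace is
    # stripped from each line so it does not end up before the tag.
    lines = [line.rstrip() for line in comment.splitlines()]
    return "<br>".join(lines)


if __name__ == "__main__":
    print(to_single_line("first line\nsecond line"))
```

<p>The result only renders as intended if the server keeps <code>&lt;br&gt;</code> in comments, which is exactly what this ticket asks for.</p>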
openQA Project - action #53891 (Resolved): [openqa] Posting comments results in getting comments ...https://progress.opensuse.org/issues/538912019-07-05T09:22:17Zrpalethorperichard.palethorpe@suse.com
<p>Take the following:</p>
<p>rich@rpws ~> openqa-client --host openqa.opensuse.org --apikey CB3705D3354546E0 --apisecret XXX jobs/975114/comments POST text=test123<br>
[<br>
{<br>
bugrefs => [],<br>
created => "2019-07-05 08:15:47 +0000",<br>
id => 43271,<br>
renderedMarkdown => "update comment test\n",<br>
text => "update comment test",<br>
updated => "2019-07-05 08:45:11 +0000",<br>
userName => "rpalethorpe",<br>
},<br>
]<br>
rich@rpws ~> openqa-client --host <a href="https://openqa.opensuse.org" class="external">https://openqa.opensuse.org</a> --apikey CB3705D3354546E0 --apisecret XXX jobs/975114/comments POST text=test123<br>
{ id => 43287 }</p>
<p>okurz thinks this may be due to <a href="https://github.com/os-autoinst/openQA/pull/2110" class="external">https://github.com/os-autoinst/openQA/pull/2110</a>.</p>
<p>Note that this only happens on O3 and not OSD. I also tried using two different versions of the openqa-client. Also the following works:</p>
<p>openqa-client --host openqa.opensuse.org --apikey CB3705D3354546E0 --apisecret XXX jobs/975114/comments/43271 PUT text="update comment test"<br>
{ id => 43271 }</p>
<p>So the problem may only affect POST requests.</p>
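<p>The two POST commands above differ only in whether <code>--host</code> carries an explicit <code>https://</code> scheme, and only the second one returned the expected result. A wrapper script could therefore normalise the host before calling the client. This is a sketch of that workaround only; <code>with_https_scheme</code> is a hypothetical helper and says nothing about the root cause.</p>

```python
def with_https_scheme(host: str) -> str:
    """Return host with an explicit scheme, defaulting to https://."""
    host = host.strip()
    # Leave the value alone if the caller already chose a scheme.
    if host.startswith(("http://", "https://")):
        return host
    return "https://" + host


if __name__ == "__main__":
    print(with_https_scheme("openqa.opensuse.org"))
```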
openQA Project - action #48182 (Resolved): [openqa] Disable bug carry over for a job grouphttps://progress.opensuse.org/issues/481822019-02-21T09:39:53Zrpalethorperichard.palethorpe@suse.com
<p>Because we have an <a href="https://gitlab.suse.de/rpalethorpe/jdp/blob/master/notebooks/Propagate%20Bug%20Tags.ipynb" class="external">external script</a> for propagating bug tags, we need to remove OpenQA's carry over comments.</p>
<p>In fact OpenQA's carry over comments are almost always wrong for the Kernel group anyway. They have also begun dropping the message stating they are a carry over comment. This makes deleting them more challenging. For example I did not post the following comment <a href="https://openqa.suse.de/tests/2481835#comment-165567" class="external">https://openqa.suse.de/tests/2481835#comment-165567</a> and neither did my script (you can see the bug summary is stale).</p>
<p>As an alternative to disabling the comments, we could parse any existing comments and check whether they do something suspicious, such as tagging a passing test, and then delete or modify them. However, this could result in legitimate comments being deleted or modified in corner cases. The user should be able to override the script, so I don't think this is a good idea.</p>
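<p>For reference, the check described above could look roughly like the sketch below. It assumes that each comment is a dictionary with a <code>bugrefs</code> list, as in the client output for comments, and that a passing job reports the result string <code>"passed"</code>. Both the function name and the result string are assumptions. The sketch only reports candidates and deletes nothing, because of the corner cases mentioned above.</p>

```python
def suspicious_comments(job_result, comments):
    """Return the comments that carry bug references although the job passed."""
    # Assumption: a passing job has the result string "passed".
    if job_result != "passed":
        return []
    # A comment is a candidate if its bugrefs list is present and non-empty.
    return [c for c in comments if c.get("bugrefs")]


if __name__ == "__main__":
    example = [
        {"id": 1, "bugrefs": ["bsc#1"], "text": "bsc#1"},
        {"id": 2, "bugrefs": [], "text": "plain note"},
    ]
    print(suspicious_comments("passed", example))
```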
<p>AFAICT it is not currently possible to disable bug carry over for a given job group.</p>
openQA Project - action #38822 (Resolved): Qemu: Could not open backing file: Cannot reference an...https://progress.opensuse.org/issues/388222018-07-25T09:41:54Zrpalethorperichard.palethorpe@suse.com
<p>When trying to revert to a snapshot, QEMU dies with the following error or something similar:</p>
<pre><code>-blockdev driver=qcow2,node-name=hd0-overlay1,file=hd0-overlay1-file,cache.no-flush=on,backing=hd0: Could not open backing file: Cannot reference an existing block device with additional options or a new filename
</code></pre>
<p>The backing file is the hd0 block device which is specified on the command line. Possibly we should not specify block devices used as backing files on the command line and just allow them to be read from the overlay file. It is not clear what the expected usage is.</p>
openQA Project - action #36460 (Resolved): [kernel][tools] QEMU Refactor - Performance settingshttps://progress.opensuse.org/issues/364602018-05-23T14:02:30Zrpalethorperichard.palethorpe@suse.com
<p>Decide on cache mode and 'discard'.</p>
openQA Project - action #36034 (Rejected): [kernel][tools] QEMU Refactor - Regression, first Grub...https://progress.opensuse.org/issues/360342018-05-09T11:08:11Zrpalethorperichard.palethorpe@suse.com
<p><a href="http://rpws.suse.cz/tests/237#step/grub_test/5" class="external">http://rpws.suse.cz/tests/237#step/grub_test/5</a></p>
<p>It appears that some files are missing from their expected location; possibly the disk configuration is not stable. Pinning the drive serial numbers may help.</p>
openQA Project - action #19174 (Rejected): [aarch64] Timeouts waiting for QEMU HMP socket during ...https://progress.opensuse.org/issues/191742017-05-16T08:05:34Zrpalethorperichard.palethorpe@suse.com
<p>Sometimes aarch64 tests time out waiting for a response from QEMU over HMP. See in particular <a href="https://openqa.suse.de/tests/933880">https://openqa.suse.de/tests/933880</a>.</p>
<pre><code>06:42:01.1674 1294 ||| finished boot_ltp kernel at 2017-05-16 06:42:01 (126 s)
06:42:01.1686 1294 Creating a VM snapshot lastgood
DIE ERROR: timeout reading hmp socket
at /usr/lib/os-autoinst/backend/baseclass.pm line 73.
backend::baseclass::die_handler('ERROR: timeout reading hmp socket\x{a}') called at /usr/lib/os-autoinst/backend/qemu.pm line 923
backend::qemu::_read_hmp('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/qemu.pm line 991
backend::qemu::_send_hmp('backend::qemu=HASH(0xd22b550)', 'savevm lastgood') called at /usr/lib/os-autoinst/backend/qemu.pm line 212
backend::qemu::save_snapshot('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9baf48)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 68
backend::baseclass::handle_command('backend::qemu=HASH(0xd22b550)', 'HASH(0xd9c08b8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 422
backend::baseclass::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)') called at /usr/lib/os-autoinst/backend/qemu.pm line 1018
backend::qemu::check_socket('backend::qemu=HASH(0xd22b550)', 'IO::Handle=GLOB(0xd64c4d8)', 0) called at /usr/lib/os-autoinst/backend/baseclass.pm line 203
eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 151
backend::baseclass::run_capture_loop('backend::qemu=HASH(0xd22b550)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 122
backend::baseclass::run('backend::qemu=HASH(0xd22b550)', 6, 9) called at /usr/lib/os-autoinst/backend/driver.pm line 85
backend::driver::start('backend::driver=HASH(0xc535e90)') called at /usr/lib/os-autoinst/backend/driver.pm line 48
backend::driver::new('backend::driver', 'qemu') called at /usr/bin/isotovideo line 206
main::init_backend() called at /usr/bin/isotovideo line 271
06:47:01.2664 1296 waitpid for 1302 returned 0
06:47:01.2665 1296 sending TERM to qemu pid: 1302
06:47:02.2668 1296 waitpid for 1302 returned 0
06:47:02.5449 1288 signalhandler got TERM - loop 1
06:47:02.5451 1288 awaiting death of commands process
06:47:02.5505 1288 commands process exited: 1292
06:47:02.5507 1288 awaiting death of testpid 1294
06:47:02.5588 1288 test process exited: 1294
06:47:02.5589 1288 isotovideo failed
</code></pre>
openQA Project - action #16616 (Rejected): ppc64le tests die/timeout while saving snapshothttps://progress.opensuse.org/issues/166162017-02-09T11:01:12Zrpalethorperichard.palethorpe@suse.com
<p>The following case clearly shows that the test timed out while waiting for a response from QEMU. In other cases it is not clear to me why the test dies, but it seems to happen at the same point (where a snapshot is saved). I thought there would be an existing ticket for this, but could not find one.</p>
<p><a href="https://openqa.suse.de/tests/762741" class="external">https://openqa.suse.de/tests/762741</a></p>
<a name="Hypothesises"></a>
<h2 >Hypotheses<a href="#Hypothesises" class="wiki-anchor">¶</a></h2>
<ul>
<li>H1, It takes too long to save the snapshot and times out, but would complete if given enough time.</li>
<li>H2, QEMU crashes</li>
<li>H3, The storage is unreachable or broken</li>
<li>H4, The socket is misread by os-autoinst</li>
</ul>
<p>H1 seems the most likely by far.</p>
<a name="Potential-Actions"></a>
<h2 >Potential Actions<a href="#Potential-Actions" class="wiki-anchor">¶</a></h2>
<ul>
<li>A1, Increase the timeout</li>
<li>A2, Increase the storage or compression performance</li>
<li>A3, Stress test OpenQA to recreate the bug and investigate further</li>
</ul>
<p>A1 is the easiest. A2 and A3 may be more profitable, but may be too difficult for now.</p>
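<p>As an illustration of A1, the sketch below reads from a socket until a marker such as the HMP prompt appears, with the timeout passed in as a parameter so that slow hosts can be given more time. os-autoinst is written in Perl and its real implementation differs; the function name <code>read_until</code> and its parameters are hypothetical.</p>

```python
import select
import socket
import time


def read_until(sock, marker, timeout):
    """Read from sock until marker is seen or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    buf = b""
    while marker not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("timeout reading socket after %.1f s" % timeout)
        # Wait for data, but never longer than the time that is left.
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            continue  # the loop re-checks the deadline
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before the marker was seen")
        buf += chunk
    return buf


if __name__ == "__main__":
    ours, theirs = socket.socketpair()
    theirs.sendall(b"(qemu) ")
    print(read_until(ours, b"(qemu) ", timeout=1.0))
```

<p>With H1 in mind, a longer timeout would simply be passed for operations such as saving a snapshot, while quick commands keep a short one.</p>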
<a name="Workarounds"></a>
<h2 >Workarounds<a href="#Workarounds" class="wiki-anchor">¶</a></h2>
<p>Simply restart the test manually.</p>
openQA Project - action #16544 (Rejected): Worker does not terminate when sent TERM signalhttps://progress.opensuse.org/issues/165442017-02-07T11:50:12Zrpalethorperichard.palethorpe@suse.com
<p>When I start a worker with</p>
<p><code>sudo -u _openqa-worker /home/richie/qa/openQA/script/worker --instance 1 --isotovideo ~/qa/os-autoinst/isotovideo --verbose --apikey 1234567890ABCDEF --apisecret 1234567890ABCDEF</code></p>
<p>and run a job which fails or completes (more often with a job which fails), the script will not close unless I send the kill signal.</p>
<p>If I press <code>^C</code> then the following is printed:<br>
<code>[INFO] quit due to signal INT</code></p>
<p>If I send <code>kill -TERM &lt;pid&gt;</code> then the following is printed:<br>
<code>[INFO] quit due to signal TERM</code></p>
<p>However, the script does not exit. Sending the KILL signal closes the script, but a Perl process remains active and must also be killed; otherwise the pool folder remains locked.</p>
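<p>Until the cause is found, a wrapper can work around the leftover process by running the worker in its own process group and signalling the whole group. The sketch below assumes a POSIX system and that the worker is started by the wrapper itself with <code>start_new_session=True</code>; the function name <code>stop_process_group</code> and the grace period are hypothetical.</p>

```python
import os
import signal
import subprocess
import sys


def stop_process_group(proc, grace=5.0):
    """Send TERM to proc's process group, wait, then KILL whatever is left."""
    pgid = proc.pid  # equals the group id because of start_new_session=True
    try:
        os.killpg(pgid, signal.SIGTERM)
    except (ProcessLookupError, PermissionError):
        pass  # nothing left to signal
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        pass  # still running, fall through to KILL
    try:
        # Sweep up survivors, such as a child that outlived its parent.
        os.killpg(pgid, signal.SIGKILL)
    except (ProcessLookupError, PermissionError):
        pass  # the whole group is already gone
    proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a worker that ignores TERM.
    code = "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(30)"
    worker = subprocess.Popen([sys.executable, "-c", code], start_new_session=True)
    print(stop_process_group(worker, grace=0.5))
```

<p>Because the KILL goes to the group rather than to a single PID, any remaining child process is removed as well, which should release the pool folder lock.</p>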
<p>If you have observed a similar problem, please comment, in case it is just my installation (which is from the Git HEAD).</p>
openQA Tests - action #16424 (Rejected): [openqa] main.pm is too oldhttps://progress.opensuse.org/issues/164242017-02-02T16:13:07Zrpalethorperichard.palethorpe@suse.com
<p>If main.pm is not updated then some tests are not scheduled correctly and may fail or run the wrong test.</p>
openQA Tests - action #15700 (Rejected): [LTP][OpenQA] ima,tpm: need TPMhttps://progress.opensuse.org/issues/157002016-12-30T10:26:11Zrpalethorperichard.palethorpe@suse.com
<p>These tests require a Trusted Platform Module which is not currently available inside our SUT's VM. At a cursory glance, there are a few options for solving this, including, but probably not limited to:</p>
<ol>
<li>Pass through the host's TPM to the guest.</li>
<li>Emulate the TPM using <a href="https://github.com/PeterHuewe/tpm-emulator" class="external">https://github.com/PeterHuewe/tpm-emulator</a> either on the guest or host.</li>
<li>Wait for QEMU TPM device emulation.</li>
</ol>
<p>In the case of option 1 we need to use the Linux vTPM proxy driver to ensure the guest doesn't take exclusive control of the TPM. This requires reconfiguring the host/worker's kernel to build the vtpmx module.</p>
<p>The second seems quite flexible, although we will need to package the emulator to run it on workers. There also appears to be an emulator built into QEMU in the works which would be easiest to configure.</p>
openQA Tests - action #15652 (Closed): [LTP][OpenQA] commands: mkfs.ntfs missinghttps://progress.opensuse.org/issues/156522016-12-27T11:26:45Zrpalethorperichard.palethorpe@suse.com
<p>The ntfsprogs package appears to be missing in SLES 12 onwards. It exists in OpenSUSE, so we could at least run the tests there, but then we start on the path of installing different sets of packages in SLES and OpenSUSE. Furthermore I am not sure that we care about NTFS support.</p>
openQA Tests - action #15620 (Rejected): [LTP][OpenQA] Crontab activity not foundhttps://progress.opensuse.org/issues/156202016-12-22T12:17:18Zrpalethorperichard.palethorpe@suse.com
<p>cron_tests.sh checks /var/log/messages and /var/log/cron for crontab activity, but neither file exists on newer systems. The test should probably try using <code>journalctl</code> as well.</p>
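<p>The fallback could work along the lines of the sketch below. The real test is a shell script, so this Python version only illustrates the selection logic. The journal unit name <code>cron</code> is an assumption and may differ between distributions, and <code>cron_log_source</code> is a hypothetical name.</p>

```python
import os


def cron_log_source(candidates=("/var/log/messages", "/var/log/cron")):
    """Return ("file", path) for the first existing log, else a journalctl command."""
    for path in candidates:
        if os.path.isfile(path):
            return ("file", path)
    # Assumption: cron logs to the journal under a unit named "cron".
    return ("command", ["journalctl", "--no-pager", "-u", "cron"])


if __name__ == "__main__":
    print(cron_log_source())
```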
openQA Project - action #14100 (Rejected): Implement ClientCutText for VNC to speed up sending texthttps://progress.opensuse.org/issues/141002016-10-07T08:47:04Zrpalethorperichard.palethorpe@suse.com
<p>Assuming the backend's VNC server supports *CutText actions we can send text more quickly using the ClientCutText message: <a href="https://tools.ietf.org/html/rfc6143#section-7.5.6" class="external">https://tools.ietf.org/html/rfc6143#section-7.5.6</a></p>
<p>Control flow:</p>
<ol>
<li>Test case calls type_string or perhaps a new call like paste_string</li>
<li>Check the guest is in a state which supports the clipboard</li>
<li>Check the string for any non-Latin characters or control codes which may break the operation</li>
<li>Send ClientCutText message in VNC.pm</li>
<li>Send the appropriate key sequence to perform paste/yank</li>
</ol>
<p>Similarly ServerCutText can be used to send text in the opposite direction, if the test writer can reliably copy text to the clipboard.</p>
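<p>Steps 3 and 4 of the control flow can be sketched as follows. The message layout follows RFC 6143, section 7.5.6: a one-byte message type of 6, three bytes of padding, a four-byte big-endian length, then the text in Latin-1. The function names are hypothetical, and the actual change would live in the Perl code of VNC.pm rather than in Python.</p>

```python
import struct

CLIENT_CUT_TEXT = 6  # message-type from RFC 6143, section 7.5.6


def is_safe_to_paste(text):
    """Step 3: accept only printable Latin-1 text without control codes."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return all(ch.isprintable() for ch in text)


def client_cut_text(text):
    """Step 4: build the ClientCutText message for the given text."""
    if not is_safe_to_paste(text):
        raise ValueError("text contains non-Latin-1 characters or control codes")
    payload = text.encode("latin-1")
    # U8 message-type, 3 bytes padding, U32 length (big-endian), then the text.
    return struct.pack(">BxxxI", CLIENT_CUT_TEXT, len(payload)) + payload


if __name__ == "__main__":
    print(client_cut_text("zypper ref").hex())
```

<p>Rejecting newlines here is a deliberately conservative reading of step 3. The protocol itself allows a bare newline as a line ending, so the check could be relaxed if the guest handles pasted newlines reliably.</p>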
<p>Potential problems:</p>
<ul>
<li>The backends may not support the *CutText operations</li>
<li>It may require a daemon to be running on the guest OS</li>
<li>Not all software supports the clipboard.</li>
</ul>
<p>Advantages:</p>
<ul>
<li>Faster</li>
<li>Won't drop keypresses</li>
<li>May work in most situations</li>
</ul>
<p>I will investigate further if other attempts to speed up text input are not adequate.</p>