openSUSE Project Management Tool: Issues (https://progress.opensuse.org/, updated 2022-03-29T14:11:54Z)
openQA Project - action #109190 (New): Invalid reusage of VLAN-Tag in multi-machine scenario, whe... (https://progress.opensuse.org/issues/109190, 2022-03-29T14:11:54Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<ul>
<li>The job <a href="http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23" class="external">http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23</a> shows an
<code>eth0: IPv4 duplicate address 10.0.2.11 detected (in use by 52:54:00:12:00:56)!</code> message.</li>
<li>The MAC address 52:54:00:12:00:56 belongs to <a href="http://openqa-3.wicked.suse.de/admin/workers/86" class="external">http://openqa-3.wicked.suse.de/admin/workers/86</a>, which was running
the job <a href="http://openqa-3.wicked.suse.de/tests/80207" class="external">http://openqa-3.wicked.suse.de/tests/80207</a> at that time.</li>
</ul>
<p>Job 80207 is a multi-machine job whose parent is <a href="http://openqa-3.wicked.suse.de/tests/80206" class="external">http://openqa-3.wicked.suse.de/tests/80206</a>. The parent started at <code>2022-03-29T09:27:13.195681+02:00</code> and ended at <code>2022-03-29T10:04:05.167885+02:00</code>, while job 80207 ended at <code>[2022-03-29T10:06:21.469947+02:00]</code>.</p>
<p>The failing job shows the qemu command at: <code>[2022-03-29T10:04:26.913030+02:00] [debug] starting: /usr/bin/qemu-system-x86_64 -vga cirrus -only-migratable...</code>, thus<br>
it started a qemu instance with the same VLAN that was still in use by job 80207.</p>
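<p>The timing argument above can be checked mechanically. The following is a minimal sketch in Python that only compares the timestamps quoted from the logs; the helper name <code>started_before_end</code> is made up for illustration and is not part of openQA.</p>

```python
from datetime import datetime


def started_before_end(start: str, end: str) -> bool:
    """Return True if the ISO 8601 timestamp `start` lies before `end`."""
    return datetime.fromisoformat(start) < datetime.fromisoformat(end)


# Timestamps copied from the logs quoted above.
parent_end = "2022-03-29T10:04:05.167885+02:00"  # job 80206 (parent)
child_end = "2022-03-29T10:06:21.469947+02:00"   # job 80207 (child)
qemu_start = "2022-03-29T10:04:26.913030+02:00"  # qemu start of the failing job 80283

# The failing job started after the parent had ended ...
started_after_parent_ended = not started_before_end(qemu_start, parent_end)
# ... but while the child was still running and using the VLAN.
started_while_child_running = started_before_end(qemu_start, child_end)

print(started_after_parent_ended, started_while_child_running)
```

<p>Both values are true for the quoted timestamps: the new qemu instance started in the window between the end of the parent and the end of the child.</p>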
<p>Simple reproducer: create two parallel boot jobs.</p>
<pre><code>id=$(openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_parent' | jq -r '.id')
echo "PARENT_ID:$id"
openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_child' \
"_PARALLEL_JOBS=$id"
</code></pre>
openQA Tests - action #100518 (Resolved): test fails in docker_image -- timeout on `docker start`... (https://progress.opensuse.org/issues/100518, 2021-10-07T07:55:02Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP3-Container-Image-Updates-s390x-sle-12-SP3_image_on_sle-15_host_docker@s390x-kvm-sle12 fails in<br>
<a href="https://openqa.suse.de/tests/7333020/modules/docker_image/steps/19" class="external">docker_image</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The base test suite is used for job templates defined in YAML documents. It has no settings of its own.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/7330977" class="external">24.310</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/7314458" class="external">24.309</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=s390x&distri=sle&flavor=Container-Image-Updates&machine=s390x-kvm-sle12&test=sle-12-SP3_image_on_sle-15_host_docker&version=12-SP3" class="external">latest</a></p>
openQA Tests - action #97208 (Resolved): test fails in prepare_test_data - timeout exceeded of do... (https://progress.opensuse.org/issues/97208, 2021-08-19T08:50:44Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-AZURE-Basic-Updates-x86_64-publiccloud_containers@64bit fails in<br>
<a href="https://openqa.suse.de/tests/6890165/modules/prepare_test_data/steps/5" class="external">prepare_test_data</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The base test suite is used for job templates defined in YAML documents. It has no settings of its own.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/6877116" class="external">20210818-1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/6867963" class="external">20210817-1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=AZURE-Basic-Updates&machine=64bit&test=publiccloud_containers&version=12-SP5" class="external">latest</a></p>
openQA Tests - action #97106 (Resolved): test fails in enable_selinux (https://progress.opensuse.org/issues/97106, 2021-08-18T09:38:51Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Command: <code>aureport -a</code></p>
<pre><code># Test died: script failed with : 3tByJ
AVC Report
===============================================================
# date time comm subj syscall class permission obj result event
===============================================================
&lt;no events of interest were found&gt;
SCRIPT_FINISHED3tByJ-1-
at /usr/lib/os-autoinst/testapi.pm line 1153.
</code></pre>
<p>From my point of view this isn't a failure; we just didn't hit anything. I'm not sure why <code>aureport -a</code> returns 1 in such a case.<br>
<a href="https://github.com/linux-audit/audit-userspace/blob/master/src/aureport.c#L154" class="external">https://github.com/linux-audit/audit-userspace/blob/master/src/aureport.c#L154</a></p>
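<p>One way a test could tolerate this case is to treat exit code 1 as harmless when the report explicitly says that nothing was found. The following is a minimal sketch in Python, not the actual test code; the function name <code>is_real_failure</code> is hypothetical, and the exit code and marker text are taken from the log above.</p>

```python
# Marker line printed by `aureport -a` in the log above when the report is empty.
NO_EVENTS_MARKER = "<no events of interest were found>"


def is_real_failure(exit_code: int, output: str) -> bool:
    """Decide whether an `aureport -a` run should fail the test.

    Assumption (from the log above): exit code 1 together with the
    "no events" marker only means that the report is empty.
    """
    if exit_code == 0:
        return False
    if exit_code == 1 and NO_EVENTS_MARKER in output:
        return False
    return True


sample_output = """AVC Report
<no events of interest were found>
"""
print(is_real_failure(1, sample_output))
```

<p>Any other non-zero exit code, or exit code 1 without the marker, is still reported as a failure.</p>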
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Same test as microos but enabling SELinux after boot.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.opensuse.org/tests/1808622" class="external">20210626</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.opensuse.org/tests/1750717" class="external">20210519</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=microos&flavor=MicroOS-Image&machine=64bit&test=microos_selinux&version=Tumbleweed" class="external">latest</a></p>
openQA Tests - action #63589 (Resolved): [kernel][public cloud] Fix prepare_tools - azure-cli 2.2... (https://progress.opensuse.org/issues/63589, 2020-02-19T10:45:08Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-Installer-DVD-POST-x86_64-create_hdd_autoyast_pc@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3900709/modules/prepare_tools/steps/63" class="external">prepare_tools</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3900709" class="external">0001</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3870285" class="external">0001</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Installer-DVD-POST&machine=64bit&test=create_hdd_autoyast_pc&version=15" class="external">latest</a></p>
openQA Project - action #58826 (Resolved): Result not rendered in detail view on short (e.g. <10s... (https://progress.opensuse.org/issues/58826, 2019-10-29T13:52:57Z, cfconrad, cfamullaconrad@suse.com)
<p>This was discovered during investigation of poo#39845.</p>
<p>The problem is that if a test module runs only for a very short time, its result isn't rendered in the detail view and only "None" is displayed.<br>
Once the job has finished, the correct result is displayed.</p>
<p>This is how it looks in the unexpected state:<br>
<img src="http://imagebin.suse.de/2480/img" alt="http://imagebin.suse.de/2480/img" /></p>
<p>After increasing the duration of one test module to >10s (DO_NOT_FAIL=1), it looks like this:<br>
<img src="http://imagebin.suse.de/2484/img" alt="http://imagebin.suse.de/2484/img" /></p>
<a name="Reproduce"></a>
<h1 >Reproduce<a href="#Reproduce" class="wiki-anchor">¶</a></h1>
<p><a href="https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7" class="external">https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7</a></p>
<p>Run with:<br>
<code>CLEMIX_EXCLUDE='^(?!no_res)' CLEMIX_NO_BOOT=1</code></p>
openQA Tests - action #58697 (Resolved): [kernel][tools] test fails in install_ltp (https://progress.opensuse.org/issues/58697, 2019-10-25T12:38:45Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The issue happens only on ARM. The suspicion is that the here-document takes too long to set up its input.<br>
The serial output looks mangled like this:</p>
<pre><code># cat > /tmp/scriptfY1mJ.sh << 'EOT_fY1mJ'; echo fY1mJ-$?-
grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
EOT_fY1mJ
echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
> grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
> EOT_fY1mJ
fY1mJ-0-
# echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
fY1mJ
3
SCRIPT_FINISHEDfY1mJ-0-
</code></pre>
<p>This results in a wrong return value from <code>script_output()</code>.</p>
<p>openQA test in scenario sle-12-SP5-Server-DVD-aarch64-install_ltp+sle+Server-DVD+KOTD@aarch64-virtio fails in<br>
<a href="https://openqa.suse.de/tests/3509151/modules/install_ltp/steps/91" class="external">install_ltp</a></p>
openQA Tests - action #58589 (Resolved): [kernel][public cloud] Add debugging logs to run_ltp test (https://progress.opensuse.org/issues/58589, 2019-10-23T10:20:50Z, cfconrad, cfamullaconrad@suse.com)
<p>We sometimes see failed tests in LTP. But without the serial log or dmesg output, there is no way to understand what's going on.<br>
This is especially true if a failure happens only once, like: <a href="https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1" class="external">https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1</a></p>
openQA Project - action #58100 (Workable): HashKeyQuotes: force no quotes for names containing "_" (https://progress.opensuse.org/issues/58100, 2019-10-14T07:45:19Z, cfconrad, cfamullaconrad@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Currently we allow quotes for names containing "_". From a Perl perspective, a name containing '_' is still a simple identifier and can be used as a hash key without quotes.</p>
<p>From <a href="https://perldoc.perl.org/perldata.html" class="external">https://perldoc.perl.org/perldata.html</a> :</p>
<pre><code>The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges
for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple
identifier.
</code></pre>
<p>So we will end up with a regex like this:</p>
<pre><code>/^[a-zA-Z][0-9a-zA-Z_]*$/
</code></pre>
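<p>To illustrate which hash keys the regex above would accept without quotes, here is a small sketch that applies the same pattern in Python rather than in the Perl::Critic policy itself. The helper name <code>needs_quotes</code> is made up for this example.</p>

```python
import re

# The pattern from above, copied verbatim.
SIMPLE_IDENTIFIER = re.compile(r"^[a-zA-Z][0-9a-zA-Z_]*$")


def needs_quotes(key: str) -> bool:
    """Return True if `key` does not match the pattern and must stay quoted."""
    return SIMPLE_IDENTIFIER.match(key) is None


for key in ("timeout", "fail_message", "foo-bar", "1st", "with space"):
    print(f"{key!r}: {'quotes needed' if needs_quotes(key) else 'bareword is fine'}")
```

<p>With this pattern, <code>fail_message</code> no longer needs quotes, while keys containing <code>-</code>, spaces, or a leading digit still do.</p>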
<p>Changing it produces perlcritic violations, so a cleanup is needed as well.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> hash keys containing <code>_</code> are accepted without surrounding quotes</li>
<li><strong>AC2:</strong> Adopted tidy rules have been applied to os-autoinst and downstream os-autoinst-distri-opensuse</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Change existing tidy checks within os-autoinst</li>
<li>Ensure os-autoinst code adheres to the new rules</li>
<li>Apply the same for os-autoinst-distri-opensuse</li>
</ul>
openQA Tests - action #56471 (Resolved): [kernel][publiccloud][flavor~"^GCE"] check permissions f... (https://progress.opensuse.org/issues/56471, 2019-09-05T08:13:29Z, cfconrad, cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-GCE-BYOS-x86_64-publiccloud_boottime@gce_n1_standard_2 fails in<br>
<a href="https://openqa.suse.de/tests/3320341/modules/boottime/steps/64" class="external">boottime</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The test measures the boot time of a SLE image inside Public Cloud providers (Amazon, Microsoft, Google).</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3320341" class="external">0.9.1-1.8</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3312415" class="external">0.9.1-1.5</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=GCE-BYOS&machine=gce_n1_standard_2&test=publiccloud_boottime&version=12-SP5" class="external">latest</a></p>
<a name="Hints"></a>
<h2 >Hints<a href="#Hints" class="wiki-anchor">¶</a></h2>
<p>We moved all departments into Vault namespaces. Maybe <a class="user active user-mention" href="https://progress.opensuse.org/users/30028">@cfconrad</a> made a mistake when creating the <code>qa-kernel/</code> namespace in Vault (publiccloud.qa.suse.de).</p>
<pre><code># terraform apply -no-color myplan ; echo B~vFt-$?-
random_id.service[0]: Creating...
random_id.service[0]: Creation complete after 0s [id=pIiK4qR_r98]
google_compute_instance.openqa[0]: Creating...
Error: Error waiting for instance to create: The user does not have access to service account 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. User: 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. Ask a project owner to grant you the iam.serviceAccountUser role on the service account
on plan.tf line 59, in resource "google_compute_instance" "openqa":
59: resource "google_compute_instance" "openqa" {
B~vFt-1-
</code></pre>
openQA Tests - action #56036 (Resolved): [kernel][ltp] Fix killall call in cgroup_fj_stress_cpu (https://progress.opensuse.org/issues/56036, 2019-08-28T09:40:03Z, cfconrad, cfamullaconrad@suse.com)
<p>Discovered a problem on s390x: <code>killall</code> doesn't reach every process.<br>
This may be a race while the procfs entries are being created.</p>
<p>A possible solution is to collect the PIDs and kill the processes individually.</p>
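<p>The idea of collecting PIDs and killing each process on its own can be sketched as follows. This is an illustration in Python, not the shell code of the LTP test; <code>kill_each</code> is a hypothetical helper and the two sleeping child processes only stand in for the real workload. It assumes a POSIX system.</p>

```python
import os
import signal
import subprocess
import sys


def kill_each(pids, sig=signal.SIGTERM):
    """Signal every PID on its own; return the PIDs that could not be signalled."""
    not_signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            # The process exited before we got to it; nothing left to kill.
            continue
        except PermissionError:
            not_signalled.append(pid)
    return not_signalled


# Demo: start two throwaway children, remember their PIDs, then kill them one by one.
children = [
    subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    for _ in range(2)
]
pids = [child.pid for child in children]
failed = kill_each(pids)
exit_codes = [child.wait(timeout=10) for child in children]
print(failed, exit_codes)
```

<p>Because the PIDs are recorded when the processes are started, the kill step does not depend on a name lookup at kill time, which is where the suspected race would occur.</p>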
openQA Tests - action #55883 (Resolved): known_issue:"Can.t fcntl" [kernel] Investigate qemu left... (https://progress.opensuse.org/issues/55883, 2019-08-23T09:07:19Z, cfconrad, cfamullaconrad@suse.com)
<p>When <a href="https://github.com/os-autoinst/os-autoinst/pull/1182" class="external">https://github.com/os-autoinst/os-autoinst/pull/1182</a> was deployed, we encountered a lot of <code>incomplete</code> jobs on o3.<br>
There seems to be a correlation between <code>save_memory_dump()</code> and qemu leftovers when this PR is included.</p>
<p>The ticket <a href="https://progress.opensuse.org/issues/55595" class="external">https://progress.opensuse.org/issues/55595</a> shows some investigation of <code>save_memory_dump()</code>. In short, I think the cause was that <code>xz</code> returns 2 on warnings. We have <code>use autodie :all</code> in qemu.pm, which then lets the process die(), and the job ends as <code>incomplete</code>.</p>
<p>So I guess that on exactly this kind of "unexpected" end, the cleanup sometimes fails to kill all qemu instances.</p>
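<p>The exit-code handling described above can be illustrated with a small sketch. It is written in Python rather than the Perl of qemu.pm and does not show the actual fix; the function name <code>xz_succeeded</code> is hypothetical. It relies on the exit statuses documented for <code>xz</code>: 0 for success, 1 for an error, and 2 for a warning without an actual error.</p>

```python
# Exit statuses documented in the xz man page.
XZ_OK = 0
XZ_ERROR = 1
XZ_WARNING = 2


def xz_succeeded(exit_code: int, tolerate_warnings: bool = True) -> bool:
    """Return True if an xz run should be treated as successful.

    With tolerate_warnings=False this mimics a strict "die on any
    non-zero exit" policy, which turns warnings into fatal errors.
    """
    if exit_code == XZ_OK:
        return True
    if exit_code == XZ_WARNING and tolerate_warnings:
        return True
    return False


print(xz_succeeded(XZ_WARNING), xz_succeeded(XZ_WARNING, tolerate_warnings=False))
```

<p>Under the strict policy a mere warning ends the process, which matches the behaviour suspected above.</p>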
<p>The investigation documented in <a href="https://progress.opensuse.org/issues/55505" class="external">https://progress.opensuse.org/issues/55505</a> points to PR#1182 and also mentions a reproducer:</p>
<pre><code>openqa-clone-job --from https://openqa.opensuse.org/tests/1006973 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/poo55505_migration_incompletes WORKER_CLASS=qemu_x86_64_tw
</code></pre>
<p>Unfortunately, I have not been able to reproduce it so far.</p>
openQA Tests - action #52919 (Resolved): [qac][public cloud] Add unit tests to PCW (https://progress.opensuse.org/issues/52919, 2019-06-12T08:02:25Z, cfconrad, cfamullaconrad@suse.com)
<p>We need unit tests in <a href="https://github.com/cfconrad/pcw" class="external">https://github.com/cfconrad/pcw</a></p>
<p>Hints:</p>
<ul>
<li><a href="https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4" class="external">https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4</a></li>
</ul>
openQA Tests - action #51101 (Resolved): [kernel][public cloud] Set IPA test result to OK (https://progress.opensuse.org/issues/51101, 2019-05-05T20:27:13Z, cfconrad, cfamullaconrad@suse.com)
<p>Fail only if there is an unexpected error. If IPA produces a valid results.json file, we use parse_extra_log() to upload it and calculate the overall result from it.<br>
This should avoid warnings from JDP about untagged failed tests.</p>
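<p>How an overall result could be derived from a results file is sketched below. This is an illustration in Python only, not the logic of <code>parse_extra_log()</code>. The JSON layout (a <code>tests</code> list whose entries carry an <code>outcome</code> field), the function name <code>overall_result</code>, and the test names in the sample data are assumptions made for this example.</p>

```python
import json

# Outcomes that make the whole run fail (assumed values, not confirmed by the source).
FAILING_OUTCOMES = {"failed", "error"}


def overall_result(results: dict) -> str:
    """Return "fail" if any test has a failing outcome, otherwise "ok"."""
    for test in results.get("tests", []):
        if test.get("outcome") in FAILING_OUTCOMES:
            return "fail"
    return "ok"


# Hypothetical results.json content, made up for illustration.
sample = json.loads("""
{
  "tests": [
    {"name": "test_a", "outcome": "passed"},
    {"name": "test_b", "outcome": "failed"}
  ]
}
""")
print(overall_result(sample))
```

<p>In this sketch the module itself would fail only on unexpected errors, such as a missing or unparsable file, while individual test failures are reflected in the calculated result.</p>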
openQA Project - action #39845 (Resolved): Results of tests with very short duration (~<10s) are ... (https://progress.opensuse.org/issues/39845, 2018-08-16T10:13:28Z, cfconrad, cfamullaconrad@suse.com)
<p>If the execution of the job takes less than approximately 10s, the results are not displayed in the openQA web UI.<br>
When the execution time is extended with "script_run('sleep 8');", the results are displayed.</p>
<p>I noticed this only with the ssh backend (<a href="https://github.com/os-autoinst/os-autoinst/pull/1012" class="external">https://github.com/os-autoinst/os-autoinst/pull/1012</a>), which is in development.</p>
<p>Failed job: <a href="http://10.86.1.52/tests/36" class="external">http://10.86.1.52/tests/36</a></p>