openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | 2022-03-29T14:11:54Z
openQA Project - action #109190 (New): Invalid reusage of VLAN-Tag in multi-machine scenario, whe... | https://progress.opensuse.org/issues/109190 | 2022-03-29T14:11:54Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<ul>
<li>The job <a href="http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23" class="external">http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23</a> shows an
<code>eth0: IPv4 duplicate address 10.0.2.11 detected (in use by 52:54:00:12:00:56)!</code> message.</li>
<li>The MAC address 52:54:00:12:00:56 belongs to <a href="http://openqa-3.wicked.suse.de/admin/workers/86" class="external">http://openqa-3.wicked.suse.de/admin/workers/86</a>, which was running
the job <a href="http://openqa-3.wicked.suse.de/tests/80207" class="external">http://openqa-3.wicked.suse.de/tests/80207</a> at that time.</li>
</ul>
<p>Job 80207 is a multi-machine job whose parent is <a href="http://openqa-3.wicked.suse.de/tests/80206" class="external">http://openqa-3.wicked.suse.de/tests/80206</a>. The parent started at <code>2022-03-29T09:27:13.195681+02:00</code> and ended at <code>2022-03-29T10:04:05.167885+02:00</code>, while job 80207 ended at <code>[2022-03-29T10:06:21.469947+02:00]</code>.</p>
<p>The failing job shows the qemu command at <code>[2022-03-29T10:04:26.913030+02:00] [debug] starting: /usr/bin/qemu-system-x86_64 -vga cirrus -only-migratable...</code>, so<br>
it started a qemu instance with the same VLAN while that VLAN was still in use by job 80207.</p>
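<p>The timing can be double-checked with a minimal sketch that uses only the timestamps quoted above. The variable and function names are made up for illustration, and treating the VLAN as "in use" until job 80207 ends is an assumption of this sketch, not something taken from the openQA scheduler code.</p>

```python
from datetime import datetime

# Timestamps copied from the job logs quoted above.
parent_end = datetime.fromisoformat("2022-03-29T10:04:05.167885+02:00")      # job 80206
child_end = datetime.fromisoformat("2022-03-29T10:06:21.469947+02:00")       # job 80207
new_qemu_start = datetime.fromisoformat("2022-03-29T10:04:26.913030+02:00")  # failing job


def starts_inside(start, window_begin, window_end):
    """Return True if start falls inside the [window_begin, window_end) interval."""
    return window_begin <= start < window_end


# Assumption: the VLAN stays in use until the last job of the cluster (80207) ends.
vlan_conflict = starts_inside(new_qemu_start, parent_end, child_end)
gap_after_parent = (new_qemu_start - parent_end).total_seconds()
remaining_child_runtime = (child_end - new_qemu_start).total_seconds()

print(f"qemu started {gap_after_parent:.1f}s after the parent ended")
print(f"job 80207 kept running for another {remaining_child_runtime:.1f}s")
print(f"start falls into the conflict window: {vlan_conflict}")
```

<p>The check only compares the three log timestamps; it does not query openQA itself.</p>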
<p>Simple reproducer: create two parallel boot jobs:</p>
<pre><code>id=$(openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_parent' | jq -r '.id')
echo "PARENT_ID:$id"
openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_child' \
"_PARALLEL_JOBS=$id"
</code></pre>
openQA Tests - action #63589 (Resolved): [kernel][public cloud] Fix prepare_tools - azure-cli 2.2... | https://progress.opensuse.org/issues/63589 | 2020-02-19T10:45:08Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-Installer-DVD-POST-x86_64-create_hdd_autoyast_pc@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3900709/modules/prepare_tools/steps/63" class="external">prepare_tools</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3900709" class="external">0001</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3870285" class="external">0001</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Installer-DVD-POST&machine=64bit&test=create_hdd_autoyast_pc&version=15" class="external">latest</a></p>
openQA Project - action #58826 (Resolved): Result not rendered in detail view on short (e.g. <10s... | https://progress.opensuse.org/issues/58826 | 2019-10-29T13:52:57Z | cfconrad | cfamullaconrad@suse.com
<p>This was discovered during investigation of poo#39845.</p>
<p>The problem is that if a test module runs for only a very short time, its result isn't rendered in the detail view and only "None" is displayed.<br>
Once the job has finished, the correct result is displayed.</p>
<p>This is how it looks in the unexpected state:<br>
<img src="http://imagebin.suse.de/2480/img" alt="http://imagebin.suse.de/2480/img" /></p>
<p>After increasing the duration of one test module to >10s (DO_NOT_FAIL=1), it looks like this:<br>
<img src="http://imagebin.suse.de/2484/img" alt="http://imagebin.suse.de/2484/img" /></p>
<a name="Reproduce"></a>
<h1 >Reproduce<a href="#Reproduce" class="wiki-anchor">¶</a></h1>
<p><a href="https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7" class="external">https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7</a></p>
<p>Run with:<br>
<code>CLEMIX_EXCLUDE='^(?!no_res)' CLEMIX_NO_BOOT=1</code></p>
openQA Tests - action #58697 (Resolved): [kernel][tools] test fails in install_ltp | https://progress.opensuse.org/issues/58697 | 2019-10-25T12:38:45Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The issue happens only on ARM. The assumption is that the here-document takes too long to set up its input.<br>
The serial output looks mangled, like this:</p>
<pre><code># cat > /tmp/scriptfY1mJ.sh << 'EOT_fY1mJ'; echo fY1mJ-$?-
grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
EOT_fY1mJ
echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
> grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
> EOT_fY1mJ
fY1mJ-0-
# echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
fY1mJ
3
SCRIPT_FINISHEDfY1mJ-0-
</code></pre>
<p>This results in a wrong return value from <code>script_output()</code>.</p>
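<p>To show why stray echoes in the serial stream matter, here is a minimal sketch of how marker-delimited output can be pulled out of such a log. It is not the os-autoinst implementation of <code>script_output()</code>; the helper name <code>parse_marked_output</code> and its regular expression are assumptions made for this illustration.</p>

```python
import re


def parse_marked_output(serial_log, marker):
    """Return (output, exit_code) found between the marker line and SCRIPT_FINISHED.

    Hypothetical helper: only a line consisting of the bare marker counts as
    the start, so echoed command lines that merely contain the marker are skipped.
    """
    pattern = re.compile(
        r"^" + re.escape(marker) + r"\r?\n"      # marker alone on a line
        r"(?P<output>.*?)"                       # everything printed by the script
        r"SCRIPT_FINISHED" + re.escape(marker) + r"-(?P<rc>\d+)-",
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(serial_log)
    if match is None:
        raise ValueError("no complete marker pair found in serial log")
    return match.group("output"), int(match.group("rc"))


# Tail of the serial output quoted above.
serial_log = """fY1mJ-0-
# echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
fY1mJ
3
SCRIPT_FINISHEDfY1mJ-0-
"""

output, exit_code = parse_marked_output(serial_log, "fY1mJ")
print(repr(output), exit_code)
```

<p>Any parser of this kind depends on the markers arriving in the expected order, which is exactly what a delayed here-document can disturb.</p>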
<p>openQA test in scenario sle-12-SP5-Server-DVD-aarch64-install_ltp+sle+Server-DVD+KOTD@aarch64-virtio fails in<br>
<a href="https://openqa.suse.de/tests/3509151/modules/install_ltp/steps/91" class="external">install_ltp</a></p>
openQA Tests - action #58589 (Resolved): [kernel][public cloud] Add debugging logs to run_ltp test | https://progress.opensuse.org/issues/58589 | 2019-10-23T10:20:50Z | cfconrad | cfamullaconrad@suse.com
<p>We sometimes see failed tests in LTP. Without the serial log or dmesg output, there is no way to understand what is going on.<br>
This is especially true if a failure happens only once, as in: <a href="https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1" class="external">https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1</a></p>
openQA Project - action #58100 (Workable): HashKeyQuotes: force no quotes for names containing "_" | https://progress.opensuse.org/issues/58100 | 2019-10-14T07:45:19Z | cfconrad | cfamullaconrad@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Currently we allow quotes for names containing "_". From Perl's perspective, a name containing '_' is still a simple identifier and can be used as a hash key without quotes.</p>
<p>From <a href="https://perldoc.perl.org/perldata.html" class="external">https://perldoc.perl.org/perldata.html</a> :</p>
<pre><code>The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges
for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple
identifier.
</code></pre>
<p>So we will end up with a regex like this:</p>
<pre><code>/^[a-zA-Z][0-9a-zA-Z_]*$/
</code></pre>
<p>Changing the rule produces perlcritic violations, so a cleanup is needed as well.</p>
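<p>As a rough illustration of which keys the proposed regex would let through unquoted, here is a small Python transcription of it. The real check is a Perl::Critic policy; the function name <code>needs_quotes</code> and the example keys are hypothetical and not taken from os-autoinst.</p>

```python
import re

# Python transcription of the regex proposed above; the Perl original is
# /^[a-zA-Z][0-9a-zA-Z_]*$/.
SIMPLE_KEY = re.compile(r"^[a-zA-Z][0-9a-zA-Z_]*$")


def needs_quotes(key):
    """Return True if a hash key would keep its quotes under the proposed rule."""
    return SIMPLE_KEY.match(key) is None


# Hypothetical example keys for illustration only.
examples = ["worker_class", "HDD_1", "foo-bar", "1st", "with space"]
for key in examples:
    print(f"{key!r}: {'quote' if needs_quotes(key) else 'bareword'}")
```

<p>Note that with this exact pattern a key starting with an underscore would still require quotes, because the first character class does not include <code>_</code>.</p>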
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> hash keys containing <code>_</code> are accepted without surrounding quotes</li>
<li><strong>AC2:</strong> Adopted tidy rules have been applied to os-autoinst and downstream os-autoinst-distri-opensuse</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Change existing tidy checks within os-autoinst</li>
<li>Ensure os-autoinst code adheres to the new rules</li>
<li>Apply the same for os-autoinst-distri-opensuse</li>
</ul>
openQA Tests - action #56471 (Resolved): [kernel][publiccloud][flavor~"^GCE"] check permissions f... | https://progress.opensuse.org/issues/56471 | 2019-09-05T08:13:29Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-GCE-BYOS-x86_64-publiccloud_boottime@gce_n1_standard_2 fails in<br>
<a href="https://openqa.suse.de/tests/3320341/modules/boottime/steps/64" class="external">boottime</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The test measures the boot time of an SLE image at public cloud providers (Amazon, Microsoft, Google).</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3320341" class="external">0.9.1-1.8</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3312415" class="external">0.9.1-1.5</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=GCE-BYOS&machine=gce_n1_standard_2&test=publiccloud_boottime&version=12-SP5" class="external">latest</a></p>
<a name="Hints"></a>
<h2 >Hints<a href="#Hints" class="wiki-anchor">¶</a></h2>
<p>We moved all departments into Vault namespaces. Maybe <a class="user active user-mention" href="https://progress.opensuse.org/users/30028">@cfconrad</a> made a mistake when creating the <code>qa-kernel/</code> namespace in Vault (publiccloud.qa.suse.de).</p>
<pre><code># terraform apply -no-color myplan ; echo B~vFt-$?-
random_id.service[0]: Creating...
random_id.service[0]: Creation complete after 0s [id=pIiK4qR_r98]
google_compute_instance.openqa[0]: Creating...
Error: Error waiting for instance to create: The user does not have access to service account 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. User: 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. Ask a project owner to grant you the iam.serviceAccountUser role on the service account
on plan.tf line 59, in resource "google_compute_instance" "openqa":
59: resource "google_compute_instance" "openqa" {
B~vFt-1-
</code></pre>
openQA Tests - action #56141 (Closed): [kernel][publiccloud] Influx db on openqa-perf.qa.suse.de ... | https://progress.opensuse.org/issues/56141 | 2019-08-30T08:37:03Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-Azure-BYOS-x86_64-publiccloud_boottime@az_Standard_A2_v2 fails in<br>
<a href="https://openqa.suse.de/tests/3309920/modules/boottime/steps/103" class="external">boottime</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The test measures the boot time of an SLE image at public cloud providers (Amazon, Microsoft, Google).</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3286544" class="external">0.9.0-1.22</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3241011" class="external">2.4</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Azure-BYOS&machine=az_Standard_A2_v2&test=publiccloud_boottime&version=12-SP5" class="external">latest</a></p>
openQA Tests - action #56036 (Resolved): [kernel][ltp] Fix killall call in cgroup_fj_stress_cpu | https://progress.opensuse.org/issues/56036 | 2019-08-28T09:40:03Z | cfconrad | cfamullaconrad@suse.com
<p>Discovered a problem on s390x: <code>killall</code> doesn't reach every process.<br>
This may be a race with the creation of the procfs entry.</p>
<p>A possible solution is to collect the PIDs and kill the processes individually.</p>
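<p>The idea of killing by recorded PID instead of by name can be sketched as follows. The actual LTP test is a shell script, so this Python version only illustrates the approach; the helper names <code>spawn_sleepers</code> and <code>kill_each</code> and the sleeping child processes are stand-ins invented for this example.</p>

```python
import signal
import subprocess
import sys


def spawn_sleepers(count):
    """Start `count` long-running child processes (stand-ins for the stress workers)."""
    return [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
        for _ in range(count)
    ]


def kill_each(processes, sig=signal.SIGTERM):
    """Signal every collected process individually and wait for each one to exit."""
    killed = []
    for proc in processes:
        proc.send_signal(sig)   # targets the recorded PID, no lookup by name
        proc.wait(timeout=10)   # reap the child so no zombie is left behind
        killed.append(proc.pid)
    return killed


workers = spawn_sleepers(3)
pids = [proc.pid for proc in workers]
killed_pids = kill_each(workers)
print(f"killed {len(killed_pids)} of {len(pids)} workers")
```

<p>Because the PIDs are recorded at spawn time, the kill step does not depend on a name-based process lookup, which is where the suspected race would occur.</p>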
openQA Tests - action #55883 (Resolved): known_issue:"Can.t fcntl" [kernel] Investigate qemu left... | https://progress.opensuse.org/issues/55883 | 2019-08-23T09:07:19Z | cfconrad | cfamullaconrad@suse.com
<p>When <a href="https://github.com/os-autoinst/os-autoinst/pull/1182" class="external">https://github.com/os-autoinst/os-autoinst/pull/1182</a> was deployed, we encountered a lot of <code>incomplete</code> jobs on o3.<br>
There seems to be a correlation between <code>save_memory_dump()</code> and qemu leftovers when this PR is included.</p>
<p>The ticket <a href="https://progress.opensuse.org/issues/55595" class="external">https://progress.opensuse.org/issues/55595</a> shows some investigation of <code>save_memory_dump()</code>. In short, I think <code>xz</code> returns exit status 2 on warnings, and since we have <code>use autodie :all</code> in qemu.pm, the process simply calls die() and the job ends as <code>incomplete</code>.</p>
<p>So I guess that at exactly this "unexpected" end, qemu sometimes has problems cleaning up and killing all qemu instances.</p>
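<p>The distinction between an xz warning and a real xz error can be sketched like this. The exit statuses are the ones documented in the xz man page (0 success, 1 error, 2 warning); the helper names are hypothetical, and this is an illustration in Python rather than the Perl code in qemu.pm.</p>

```python
import subprocess

# Exit statuses documented in the xz man page.
XZ_OK = 0
XZ_ERROR = 1
XZ_WARNING = 2


def xz_result_is_fatal(returncode):
    """Treat only real errors as fatal; status 2 means a warning, not an error."""
    return returncode not in (XZ_OK, XZ_WARNING)


def compress_tolerating_warnings(path):
    """Run xz on path and raise only if xz reports an actual error.

    Hypothetical helper; it needs the xz binary and is not called in this sketch.
    """
    completed = subprocess.run(["xz", "--keep", path])
    if xz_result_is_fatal(completed.returncode):
        raise RuntimeError(f"xz failed with exit status {completed.returncode}")
    return completed.returncode


for code in (XZ_OK, XZ_ERROR, XZ_WARNING):
    print(code, "fatal" if xz_result_is_fatal(code) else "tolerated")
```

<p>A blanket "die on any non-zero exit" policy cannot make this distinction, which would match the behaviour described above.</p>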
<p>The investigation documented in <a href="https://progress.opensuse.org/issues/55505" class="external">https://progress.opensuse.org/issues/55505</a> points to PR#1182 and also mentions a reproducer like:</p>
<pre><code>openqa-clone-job --from https://openqa.opensuse.org/tests/1006973 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/poo55505_migration_incompletes WORKER_CLASS=qemu_x86_64_tw
</code></pre>
<p>Unfortunately, I haven't been able to reproduce it so far.</p>
openQA Tests - action #55427 (Rejected): [kernel][public cloud] Investigate ec2 image upload erro... | https://progress.opensuse.org/issues/55427 | 2019-08-13T09:07:23Z | cfconrad | cfamullaconrad@suse.com
<p>The image upload is OK, but during cleanup we get the following error message:</p>
<pre><code>Created image: ami-074c17922d96d6876
An error occurred (DependencyViolation) when calling the DeleteSecurityGroup operation: resource sg-0cd4d10d680677a98 has a dependent object
</code></pre>
<p>Simply re-triggering works for now, as the image has been uploaded and the next run just finds it.</p>
openQA Tests - action #54962 (Closed): [kernel][public cloud] Timeout on `img-proof --version` | https://progress.opensuse.org/issues/54962 | 2019-08-01T12:10:55Z | cfconrad | cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-Azure-Standard-On-Demand-x86_64-publiccloud_ipa_on_demand_azure@az_Standard_A2_v2 fails in<br>
<a href="https://openqa.suse.de/tests/3198792/modules/ipa/steps/101" class="external">ipa</a></p>
<p>The problem is that when you look at <a href="https://openqa.suse.de/tests/3198792/file/serial_terminal.txt" class="external">https://openqa.suse.de/tests/3198792/file/serial_terminal.txt</a>, you find the complete output, like:</p>
<pre><code># cat > /tmp/script4b0jz.sh << 'EOT_4b0jz'; echo 4b0jz-$?-
> img-proof --version
> EOT_4b0jz
4b0jz-0-
# echo 4b0jz; bash -oe pipefail /tmp/script4b0jz.sh ; echo SCRIPT_FINISHED4b0jz-$?-
4b0jz
img-proof, version 4.2.1
SCRIPT_FINISHED4b0jz-0-
</code></pre>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3168177" class="external">1.50</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: (unknown) (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?version=12-SP5&arch=x86_64&distri=sle&test=publiccloud_ipa_on_demand_azure&machine=az_Standard_A2_v2&flavor=Azure-Standard-On-Demand" class="external">latest</a></p>
openQA Tests - action #51101 (Resolved): [kernel][public cloud] Set IPA test result to OK | https://progress.opensuse.org/issues/51101 | 2019-05-05T20:27:13Z | cfconrad | cfamullaconrad@suse.com
<p>Fail only if there is an unexpected error. If IPA produces a valid results.json file, we use parse_extra_log() to upload it and calculate the overall result from it.<br>
This should avoid warnings from JDP about untagged failed tests.</p>
openQA Tests - action #51098 (Closed): [kernel][wicked] Copy files to VM without network | https://progress.opensuse.org/issues/51098 | 2019-05-05T20:20:46Z | cfconrad | cfamullaconrad@suse.com
<p>In the wicked test suite, it would be a nice feature to be able to copy files to and from the VM without a working network.</p>
openQA Project - action #39845 (Resolved): Results of tests with very short duration (~<10s) are ... | https://progress.opensuse.org/issues/39845 | 2018-08-16T10:13:28Z | cfconrad | cfamullaconrad@suse.com
<p>If the execution of a job takes less than approximately 10s, the results are not displayed in the openQA web UI.<br>
When the execution time is extended with "script_run('sleep 8');", the results are displayed.</p>
<p>I noticed this only with the ssh backend (<a href="https://github.com/os-autoinst/os-autoinst/pull/1012" class="external">https://github.com/os-autoinst/os-autoinst/pull/1012</a>), which is in development.</p>
<p>Failed job: <a href="http://10.86.1.52/tests/36" class="external">http://10.86.1.52/tests/36</a></p>