openSUSE Project Management Tool: Issues https://progress.opensuse.org/ 2022-03-29T14:11:54Z openSUSE Project Management Tool
Redmine
openQA Project - action #109190 (New): Invalid reusage of VLAN-Tag in multi-machine scenario, whe... https://progress.opensuse.org/issues/109190 2022-03-29T14:11:54Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<ul>
<li>The job <a href="http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23" class="external">http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23</a> shows the message
<code>eth0: IPv4 duplicate address 10.0.2.11 detected (in use by 52:54:00:12:00:56)!</code>.</li>
<li>The MAC address 52:54:00:12:00:56 belongs to <a href="http://openqa-3.wicked.suse.de/admin/workers/86" class="external">http://openqa-3.wicked.suse.de/admin/workers/86</a>, which was running
job <a href="http://openqa-3.wicked.suse.de/tests/80207" class="external">http://openqa-3.wicked.suse.de/tests/80207</a> at that time.</li>
</ul>
<p>Job 80207 is a multi-machine job and its parent is <a href="http://openqa-3.wicked.suse.de/tests/80206" class="external">http://openqa-3.wicked.suse.de/tests/80206</a>, which started at <code>2022-03-29T09:27:13.195681+02:00</code> and ended at <code>2022-03-29T10:04:05.167885+02:00</code>, while job 80207 ended at <code>[2022-03-29T10:06:21.469947+02:00]</code>.</p>
<p>The failing job shows the qemu command at: <code>[2022-03-29T10:04:26.913030+02:00] [debug] starting: /usr/bin/qemu-system-x86_64 -vga cirrus -only-migratable...</code>, thus<br>
it started a qemu instance with the same VLAN that was still in use by job 80207.</p>
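<p>The timing argument can be checked mechanically. The following is a minimal sketch that only uses the timestamps quoted above; the function name is hypothetical, and the idea that the VLAN becomes available again as soon as the parent job ends is an assumption about the cause, not something confirmed here.</p>

```python
from datetime import datetime

# Timestamps copied from the job logs quoted above.
parent_end = datetime.fromisoformat("2022-03-29T10:04:05.167885+02:00")      # job 80206
child_end = datetime.fromisoformat("2022-03-29T10:06:21.469947+02:00")       # job 80207
new_qemu_start = datetime.fromisoformat("2022-03-29T10:04:26.913030+02:00")  # job 80283


def vlan_reused_too_early(released_at, in_use_until, new_start):
    """Return True if a new job started after the VLAN was (assumed to be)
    released but before the last job of the cluster had finished."""
    return released_at <= new_start < in_use_until


# Hypothetical check: did job 80283 start while job 80207 was still running?
overlap = vlan_reused_too_early(parent_end, child_end, new_qemu_start)
overlap_seconds = (child_end - new_qemu_start).total_seconds()
print(overlap, overlap_seconds)
```

<p>With these values the new qemu instance starts about 21 seconds after the parent ended and almost two minutes before the child finished.</p>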
<p>Simple reproducer: create two parallel boot jobs:</p>
<pre><code>id=$(openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_parent' | jq -r '.id')
echo "PARENT_ID:$id"
openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_child' \
"_PARALLEL_JOBS=$id"
</code></pre>
openQA Tests - action #66116 (Resolved): [qac][public cloud] Use VM id instead of name for Azure https://progress.opensuse.org/issues/66116 2020-04-27T21:49:40Z cfconrad cfamullaconrad@suse.com
<p>In Azure, a VM can be specified by resource group (rg) plus instance name<br>
or by instance ID.<br>
With this patch we move from rg+name to ID, which makes later usage<br>
easier, but it also breaks the implicit assumption that name and rg are always<br>
equal.</p>
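<p>As an illustration of why the ID is sufficient on its own, an Azure VM resource ID has the form <code>/subscriptions/&lt;sub&gt;/resourceGroups/&lt;rg&gt;/providers/Microsoft.Compute/virtualMachines/&lt;name&gt;</code>, so both values can be recovered from it. The sketch below is not the code from the patch; the function name and the example values are hypothetical.</p>

```python
def parse_vm_id(vm_id):
    """Split an Azure VM resource ID into (resource_group, vm_name).

    Resource IDs are a sequence of key/value path segments, so we pair
    them up and look the two fields up case-insensitively.
    """
    parts = vm_id.strip("/").split("/")
    fields = {parts[i].lower(): parts[i + 1] for i in range(0, len(parts) - 1, 2)}
    return fields["resourcegroups"], fields["virtualmachines"]


# Hypothetical example values, where rg and name differ on purpose.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/openqa-rg"
    "/providers/Microsoft.Compute/virtualMachines/openqa-vm"
)
print(parse_vm_id(example_id))
```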
openQA Tests - action #64782 (Resolved): [public cloud][kernel] GCE failed in upload_image https://progress.opensuse.org/issues/64782 2020-03-24T23:24:19Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-GCE-x86_64-publiccloud_upload_img@gce_n1_standard_2 fails in<br>
<a href="https://openqa.suse.de/tests/4032758/modules/upload_image/steps/60" class="external">upload_image</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Upload a public-cloud image to the CSP. How the image gets uploaded depends on the CSP. If the image already exists, the upload is skipped.</p>
<p>Maintainer: <a href="mailto:cfamullaconrad@suse.com">cfamullaconrad@suse.com</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/4030617" class="external">0012</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/4023584" class="external">0011</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=GCE&machine=gce_n1_standard_2&test=publiccloud_upload_img&version=15-SP2" class="external">latest</a></p>
openQA Tests - action #64643 (Resolved): [qac][public cloud] EC2-HVM-ARM image upload missing bil... https://progress.opensuse.org/issues/64643 2020-03-19T20:41:33Z cfconrad cfamullaconrad@suse.com
<p>We need to use <code>--use-root-swap</code> when uploading "on-demand" images to EC2. This is not possible for ARM images, because of a bug in ec2uploadutils:</p>
<p><a href="https://bugzilla.suse.com/show_bug.cgi?id=1167148" class="external">https://bugzilla.suse.com/show_bug.cgi?id=1167148</a></p>
<pre><code>ec2uploadimg --access-id "$AWS_ACCESS_KEY_ID" -s "$AWS_SECRET_ACCESS_KEY" --backing-store ssd --grub2 --machine 'arm64' -n 'cfconrad-SLES15-SP2.aarch64-0.9.9-EC2-HVM-Build1.24.raw.xz' --virt-type hvm --sriov-support --use-root-swap --ena-support --verbose --regions 'eu-central-1' --ssh-key-pair 'openqa1584575383_0' --private-key-file QA_SSH_KEY.pem -d 'OpenQA tests' --ec2-ami ami-0df4259a0762ee347 -t a1.large --vpc-subnet-id subnet-b44c72df 'SLES15-SP2.aarch64-0.9.9-EC2-HVM-Build1.24.raw.xz' ;
</code></pre>
<p><code>subnet-b44c72df</code> is specified because otherwise the tool tries to create the helper instance in Availability Zone <code>eu-central-1c</code>, where no <code>a1.large</code> instances are available.</p>
<p>The <code>ec2uploadimg</code> call fails with:</p>
<pre><code>Could not find disk device in helper instance with path /dev/sdf or /dev/xvdf
</code></pre>
openQA Tests - action #64144 (Resolved): [qac][public cloud] GCE upload fail with ResumableUpload... https://progress.opensuse.org/issues/64144 2020-03-03T18:18:55Z cfconrad cfamullaconrad@suse.com
<p>ResumableUploadAbortException: 403 <a href="mailto:vaultopenqa-role-1580980463@suse-sle-qa.iam.gserviceaccount.com">vaultopenqa-role-1580980463@suse-sle-qa.iam.gserviceaccount.com</a> does not have storage.objects.delete access to openqa-suse-de/SLES15-SP2-BYOS.x86_64-0.9.3-GCE-Build2.26.tar.gz.</p>
<p><a href="https://openqa.suse.de/tests/3948523#step/upload_image/55" class="external">https://openqa.suse.de/tests/3948523#step/upload_image/55</a></p>
openQA Tests - action #63589 (Resolved): [kernel][public cloud] Fix prepare_tools - azure-cli 2.2... https://progress.opensuse.org/issues/63589 2020-02-19T10:45:08Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-Installer-DVD-POST-x86_64-create_hdd_autoyast_pc@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3900709/modules/prepare_tools/steps/63" class="external">prepare_tools</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3900709" class="external">0001</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3870285" class="external">0001</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Installer-DVD-POST&machine=64bit&test=create_hdd_autoyast_pc&version=15" class="external">latest</a></p>
openQA Tests - action #60725 (Resolved): [kernel]Can't use an undefined value as a symbol referen... https://progress.opensuse.org/issues/60725 2019-12-05T10:55:30Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-Server-DVD-Incidents-Kernel-s390x-ltp_syscalls@s390x-kvm-sle12 fails in<br>
<a href="https://openqa.suse.de/tests/3662283/modules/boot_ltp/steps/24" class="external">boot_ltp</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>A large collection of tests for individual system calls. This is generally considered to be the most important test suite within the LTP.</p>
<p>The IPC tests have been filtered out from the syscalls runtest file. They are run as part of the syscalls_ipc test case.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3662283" class="external">:13400:kernel-ec2</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3660987" class="external">4.12.14-142.1.g6e7819a</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=s390x&distri=sle&flavor=Server-DVD-Incidents-Kernel&machine=s390x-kvm-sle12&test=ltp_syscalls&version=12-SP5" class="external">latest</a></p>
openQA Tests - action #58697 (Resolved): [kernel][tools] test fails in install_ltp https://progress.opensuse.org/issues/58697 2019-10-25T12:38:45Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The issue happens only on ARM. The suspicion is that it takes too long for the here-document (heredoc) to set up its input.<br>
The serial output looks mangled like this:</p>
<pre><code># cat > /tmp/scriptfY1mJ.sh << 'EOT_fY1mJ'; echo fY1mJ-$?-
grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
EOT_fY1mJ
echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
> grep -c 'menuentry .SLES \?12-SP5.*(ima_policy=tcb)' /boot/grub2/grub.cfg
> EOT_fY1mJ
fY1mJ-0-
# echo fY1mJ; bash -oe pipefail /tmp/scriptfY1mJ.sh ; echo SCRIPT_FINISHEDfY1mJ-$?-
fY1mJ
3
SCRIPT_FINISHEDfY1mJ-0-
</code></pre>
<p>This results in a wrong return value from <code>script_output()</code>.</p>
<p>openQA test in scenario sle-12-SP5-Server-DVD-aarch64-install_ltp+sle+Server-DVD+KOTD@aarch64-virtio fails in<br>
<a href="https://openqa.suse.de/tests/3509151/modules/install_ltp/steps/91" class="external">install_ltp</a></p>
openQA Tests - action #58589 (Resolved): [kernel][public cloud] Add debugging logs to run_ltp test https://progress.opensuse.org/issues/58589 2019-10-23T10:20:50Z cfconrad cfamullaconrad@suse.com
<p>We sometimes see failed tests in LTP, but without the serial log or dmesg output there is no way to understand what is going on.<br>
This is especially true if a failure happens only once, as in: <a href="https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1" class="external">https://openqa.suse.de/tests/3508220#step/cve-2017-18075/1</a></p>
openQA Project - action #58100 (Workable): HashKeyQuotes: force no quotes for names containing "_" https://progress.opensuse.org/issues/58100 2019-10-14T07:45:19Z cfconrad cfamullaconrad@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Currently we allow quotes for names containing "_". From a Perl perspective, a name containing '_' is still a simple identifier and can be used without quotes as a hash key.</p>
<p>From <a href="https://perldoc.perl.org/perldata.html" class="external">https://perldoc.perl.org/perldata.html</a> :</p>
<pre><code>The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges
for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple
identifier.
</code></pre>
<p>So we will end up with a regex like this:</p>
<pre><code>/^[a-zA-Z][0-9a-zA-Z_]*$/
</code></pre>
<p>Changing it produces perlcritic violations, so a cleanup is needed as well.</p>
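<p>To see which keys the proposed pattern would treat as simple identifiers, here is a small sketch that translates the regex above to Python. It only illustrates the pattern; it is not the perlcritic policy itself, and the function name is hypothetical.</p>

```python
import re

# Python translation of the Perl regex quoted above.
SIMPLE_IDENTIFIER = re.compile(r"^[a-zA-Z][0-9a-zA-Z_]*$")


def may_be_unquoted(key):
    """Return True if the hash key matches the proposed pattern and
    would therefore be expected to appear without quotes."""
    return SIMPLE_IDENTIFIER.match(key) is not None


for key in ("foo", "foo_bar", "foo-bar", "1abc", "_private"):
    print(key, may_be_unquoted(key))
```

<p>Note that with this exact pattern a key starting with an underscore, such as <code>_private</code>, does not match, because the first character must be a letter.</p>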
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> hash keys containing <code>_</code> are accepted without surrounding quotes</li>
<li><strong>AC2:</strong> Adopted tidy rules have been applied to os-autoinst and downstream os-autoinst-distri-opensuse</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Change existing tidy checks within os-autoinst</li>
<li>Ensure os-autoinst code adheres to the new rules</li>
<li>Apply the same for os-autoinst-distri-opensuse</li>
</ul>
openQA Tests - action #56471 (Resolved): [kernel][publiccloud][flavor~"^GCE"] check permissions f... https://progress.opensuse.org/issues/56471 2019-09-05T08:13:29Z cfconrad cfamullaconrad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-GCE-BYOS-x86_64-publiccloud_boottime@gce_n1_standard_2 fails in<br>
<a href="https://openqa.suse.de/tests/3320341/modules/boottime/steps/64" class="external">boottime</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>The test measures the boot time of SLE images at public cloud providers (Amazon, Microsoft, Google).</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3320341" class="external">0.9.1-1.8</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3312415" class="external">0.9.1-1.5</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=GCE-BYOS&machine=gce_n1_standard_2&test=publiccloud_boottime&version=12-SP5" class="external">latest</a></p>
<a name="Hints"></a>
<h2 >Hints<a href="#Hints" class="wiki-anchor">¶</a></h2>
<p>We moved all departments into Vault namespaces. Maybe <a class="user active user-mention" href="https://progress.opensuse.org/users/30028">@cfconrad</a> made a mistake when creating the <code>qa-kernel/</code> namespace in Vault (publiccloud.qa.suse.de).</p>
<pre><code># terraform apply -no-color myplan ; echo B~vFt-$?-
random_id.service[0]: Creating...
random_id.service[0]: Creation complete after 0s [id=pIiK4qR_r98]
google_compute_instance.openqa[0]: Creating...
Error: Error waiting for instance to create: The user does not have access to service account 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. User: 'vaultopenqa-role-1567427855@suse-sle-qa.iam.gserviceaccount.com'. Ask a project owner to grant you the iam.serviceAccountUser role on the service account
on plan.tf line 59, in resource "google_compute_instance" "openqa":
59: resource "google_compute_instance" "openqa" {
B~vFt-1-
</code></pre>
openQA Tests - action #56462 (Resolved): [qac][public cloud] Improve boottime test - Adopt thresh... https://progress.opensuse.org/issues/56462 2019-09-04T13:23:05Z cfconrad cfamullaconrad@suse.com
<p>Our current boot time threshold needs to be provider-specific.<br>
This was agreed with the PC team in the last meeting on 2020-03-16.</p>
<p>Also, guestregister is slow because of the SCC API it relies on, and the PC team currently sees no way to improve this from their side.</p>
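<p>A provider-specific threshold could look roughly like the sketch below. The provider keys, the function name, and above all the numbers are placeholders chosen for illustration; they are not the limits used by the real test.</p>

```python
# Placeholder limits in seconds. These are NOT the real values from the
# boottime test; they only show the per-provider lookup.
DEFAULT_THRESHOLDS = {"EC2": 60.0, "AZURE": 120.0, "GCE": 90.0}


def boottime_ok(provider, measured_seconds, thresholds=DEFAULT_THRESHOLDS):
    """Compare a measured boot time against the limit of its provider."""
    try:
        limit = thresholds[provider.upper()]
    except KeyError:
        raise ValueError(f"no boot time threshold configured for {provider!r}")
    return measured_seconds <= limit


print(boottime_ok("azure", 100.0))  # within the placeholder Azure limit
print(boottime_ok("ec2", 100.0))    # above the placeholder EC2 limit
```

<p>Raising an error for an unknown provider, rather than silently falling back to a global default, keeps a missing threshold from going unnoticed.</p>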
openQA Tests - action #56036 (Resolved): [kernel][ltp] Fix killall call in cgroup_fj_stress_cpu https://progress.opensuse.org/issues/56036 2019-08-28T09:40:03Z cfconrad cfamullaconrad@suse.com
<p>On s390x we discovered that <code>killall</code> does not reach every process.<br>
This may be a race while the procfs entries are being created.</p>
<p>A possible solution is to collect the PIDs and kill each process separately.</p>
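<p>The idea of collecting PIDs and signalling each one individually can be sketched as follows. This is an illustration in Python for POSIX systems, not the fix applied to the LTP test; the helper name and the sleeping child processes are hypothetical.</p>

```python
import os
import signal
import subprocess
import sys


def kill_each(pids, sig=signal.SIGTERM):
    """Send sig to every PID individually instead of matching by name.

    Returns the PIDs that no longer existed when we tried to signal them.
    """
    missing = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            missing.append(pid)
    return missing


if __name__ == "__main__":
    # Start a few long-running children and remember their PIDs at spawn time.
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
        for _ in range(3)
    ]
    kill_each([child.pid for child in children])
    for child in children:
        child.wait(timeout=10)
```

<p>Because the PIDs are recorded when the processes are started, no lookup by process name is needed at kill time.</p>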
openQA Tests - action #52919 (Resolved): [qac][public cloud] Add unit tests to PCW https://progress.opensuse.org/issues/52919 2019-06-12T08:02:25Z cfconrad cfamullaconrad@suse.com
<p>We need unit tests in <a href="https://github.com/cfconrad/pcw" class="external">https://github.com/cfconrad/pcw</a></p>
<p>Hints:</p>
<ul>
<li><a href="https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4" class="external">https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4</a></li>
</ul>
openQA Tests - action #51101 (Resolved): [kernel][public cloud] Set IPA test result to OK https://progress.opensuse.org/issues/51101 2019-05-05T20:27:13Z cfconrad cfamullaconrad@suse.com
<p>Fail only if there is an unexpected error. If IPA produces a valid results.json file, we use parse_extra_log() to upload it and calculate the overall result from it.<br>
This should avoid warnings from JDP about untagged failed tests.</p>