openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | 2022-03-29T14:11:54Z
openQA Project - action #109190 (New): Invalid reusage of VLAN-Tag in multi-machine scenario, whe... | https://progress.opensuse.org/issues/109190 | 2022-03-29T14:11:54Z | cfconrad (cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<ul>
<li>The job <a href="http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23" class="external">http://openqa-3.wicked.suse.de/tests/80283#step/before_test/23</a> shows an
<code>eth0: IPv4 duplicate address 10.0.2.11 detected (in use by 52:54:00:12:00:56)!</code> message.</li>
<li>The MAC address 52:54:00:12:00:56 belongs to <a href="http://openqa-3.wicked.suse.de/admin/workers/86" class="external">http://openqa-3.wicked.suse.de/admin/workers/86</a>, which was running
the job <a href="http://openqa-3.wicked.suse.de/tests/80207" class="external">http://openqa-3.wicked.suse.de/tests/80207</a> at that time.</li>
</ul>
<p>The job 80207 is a multi-machine job and its parent is <a href="http://openqa-3.wicked.suse.de/tests/80206" class="external">http://openqa-3.wicked.suse.de/tests/80206</a>, which started at <code>2022-03-29T09:27:13.195681+02:00</code> and ended at <code>2022-03-29T10:04:05.167885+02:00</code>, while job 80207 ended at <code>[2022-03-29T10:06:21.469947+02:00]</code>.</p>
<p>The failing job shows the qemu command at <code>[2022-03-29T10:04:26.913030+02:00] [debug] starting: /usr/bin/qemu-system-x86_64 -vga cirrus -only-migratable...</code>, thus<br>
it started a qemu instance with the same VLAN that was still in use by job 80207.</p>
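<p>The overlap can be checked with a few lines of shell. This is only a sketch: the three timestamps are copied from the logs quoted above, while the helper <code>secs_of_day</code> and the variable names are made up for this illustration and assume that all timestamps fall on the same day with the same UTC offset.</p>

```shell
# Timestamps copied from the logs quoted above.
parent_end='2022-03-29T10:04:05.167885+02:00'
child_end='2022-03-29T10:06:21.469947+02:00'
new_qemu_start='2022-03-29T10:04:26.913030+02:00'

# Hypothetical helper: seconds since midnight of an ISO 8601 timestamp.
# Assumes same day and same UTC offset; fractional seconds are dropped.
secs_of_day() {
  t=${1#*T}      # strip the date part
  t=${t%%.*}     # strip fractional seconds and the offset
  h=${t%%:*}; rest=${t#*:}; m=${rest%%:*}; s=${rest#*:}
  # Remove one leading zero so values like 09 are not parsed as octal.
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

gap_after_parent=$(( $(secs_of_day "$new_qemu_start") - $(secs_of_day "$parent_end") ))
overlap_with_child=$(( $(secs_of_day "$child_end") - $(secs_of_day "$new_qemu_start") ))

echo "new qemu started ${gap_after_parent}s after the parent ended"
echo "job 80207 kept running for another ${overlap_with_child}s"
```

<p>With the values above, the new qemu instance started roughly 21 seconds after the parent ended, while job 80207 kept running for almost two more minutes.</p>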
<p>Simple reproducer: create two parallel boot jobs.</p>
<pre><code>id=$(openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_parent' | jq -r '.id')
echo "PARENT_ID:$id"
openqa-cli api --host http://openqa-3.wicked.suse.de -X POST jobs 'ARCH=x86_64' 'DISTRI=opensuse' 'FLAVOR=CI' 'MACHINE=x86_64' 'VERSION=Tumbleweed' '_GROUP_ID=0' \
'BOOT_HDD_IMAGE=1' 'DESKTOP=textmode' 'HDD_1=tumbleweed.qcow2' 'KEEP_GRUB_TIMEOUT=1' \
'BACKEND=qemu' 'NICTYPE=tap' 'WORKER_CLASS=tap,qemu_x86_64' \
'SCHEDULE=tests/boot/boot_to_desktop' 'TEST=check_vlan_on_mm_job_child' \
"_PARALLEL_JOBS=$id"
</code></pre>
openQA Project - action #58826 (Resolved): Result not rendered in detail view on short (e.g. <10s... | https://progress.opensuse.org/issues/58826 | 2019-10-29T13:52:57Z | cfconrad (cfamullaconrad@suse.com)
<p>This was discovered during investigation of poo#39845.</p>
<p>The problem is that if a test module runs only for a very short time, its result isn't rendered in the detail view and only "None" is displayed.<br>
Once the job has finished, the correct result is displayed.</p>
<p>This is how it looks in the unexpected state:<br>
<img src="http://imagebin.suse.de/2480/img" alt="http://imagebin.suse.de/2480/img" /></p>
<p>After increasing the duration of one test module to >10s (DO_NOT_FAIL=1), it looks like this:<br>
<img src="http://imagebin.suse.de/2484/img" alt="http://imagebin.suse.de/2484/img" /></p>
<a name="Reproduce"></a>
<h1 >Reproduce<a href="#Reproduce" class="wiki-anchor">¶</a></h1>
<p><a href="https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7" class="external">https://github.com/cfconrad/os-autoinst-distri-opensuse/commit/b2204b65b15459654d531b1dfd6221aab296a3f7</a></p>
<p>Run with:<br>
<code>CLEMIX_EXCLUDE='^(?!no_res)' CLEMIX_NO_BOOT=1</code></p>
openQA Project - action #58100 (Workable): HashKeyQuotes: force no quotes for names containing "_" | https://progress.opensuse.org/issues/58100 | 2019-10-14T07:45:19Z | cfconrad (cfamullaconrad@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Currently we allow quotes for names containing "_". From a Perl perspective, a name containing '_' is still a simple identifier and can be used without quotes as a hash key.</p>
<p>From <a href="https://perldoc.perl.org/perldata.html" class="external">https://perldoc.perl.org/perldata.html</a> :</p>
<pre><code>The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges
for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple
identifier.
</code></pre>
<p>So we will end up with a regex like this:</p>
<pre><code>/^[a-zA-Z][0-9a-zA-Z_]*$/
</code></pre>
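<p>To get a feeling for which keys such a pattern would accept, the regex can be tried out with <code>grep -E</code>. This is only a sketch: the function name <code>is_bareword_key</code> and the sample keys are made up for this illustration and are not part of the perlcritic policy.</p>

```shell
# Hypothetical helper: succeeds if the argument matches the regex above,
# i.e. the key could be written without quotes.
is_bareword_key() {
  printf '%s\n' "$1" | grep -Eq '^[a-zA-Z][0-9a-zA-Z_]*$'
}

# Sample keys, chosen only for illustration.
for key in worker_class HDD_1 needs-quotes 2fast; do
  if is_bareword_key "$key"; then
    echo "$key: no quotes needed"
  else
    echo "$key: keep the quotes"
  fi
done
```

<p>Keys with an underscore pass the check, while keys containing a dash or starting with a digit still need quotes.</p>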
<p>Changing it produces perlcritic violations, so a cleanup is needed as well.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> hash keys containing <code>_</code> are accepted without surrounding quotes</li>
<li><strong>AC2:</strong> Adopted tidy rules have been applied to os-autoinst and downstream os-autoinst-distri-opensuse</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Change existing tidy checks within os-autoinst</li>
<li>Ensure os-autoinst code adheres to the new rules</li>
<li>Apply the same for os-autoinst-distri-opensuse</li>
</ul>
openQA Tests - action #56036 (Resolved): [kernel][ltp] Fix killall call in cgroup_fj_stress_cpu | https://progress.opensuse.org/issues/56036 | 2019-08-28T09:40:03Z | cfconrad (cfamullaconrad@suse.com)
<p>Discovered a problem on s390x: <code>killall</code> doesn't reach every process.<br>
This may be a race with the creation of the procfs entries.</p>
<p>A possible solution is to collect the PIDs and kill them individually.</p>
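<p>A minimal sketch of that idea, not the actual LTP code: the <code>sleep</code> processes are stand-ins for the stress workers, and the PID of each one is recorded at spawn time so that no lookup by name is needed later.</p>

```shell
# Start a few background workers (stand-ins) and remember each PID.
pids=""
for i in 1 2 3; do
  sleep 300 &
  pids="$pids $!"
done

# Signal every recorded PID on its own instead of looking processes up by name.
for pid in $pids; do
  kill "$pid" 2>/dev/null || echo "PID $pid was already gone"
done

# Reap the children so that none of them lingers as a zombie.
for pid in $pids; do
  wait "$pid" 2>/dev/null || true
done
echo "all workers stopped"
```

<p>Because the PIDs come from <code>$!</code>, the kill loop does not depend on when the procfs entry of a new process becomes visible.</p>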
openQA Tests - action #55883 (Resolved): known_issue:"Can.t fcntl" [kernel] Investigate qemu left... | https://progress.opensuse.org/issues/55883 | 2019-08-23T09:07:19Z | cfconrad (cfamullaconrad@suse.com)
<p>When <a href="https://github.com/os-autoinst/os-autoinst/pull/1182" class="external">https://github.com/os-autoinst/os-autoinst/pull/1182</a> was deployed, we encountered a lot of <code>incomplete</code> jobs on o3.<br>
There seems to be a correlation between <code>save_memory_dump()</code> and qemu leftovers when this PR is included.</p>
<p>The ticket <a href="https://progress.opensuse.org/issues/55595" class="external">https://progress.opensuse.org/issues/55595</a> shows some investigation of <code>save_memory_dump()</code>. In short, I think the cause was that <code>xz</code> returns 2 on warnings. We have <code>use autodie :all</code> in qemu.pm, which then simply lets the process die(), and the job ends as <code>incomplete</code>.</p>
<p>So I guess that on exactly this kind of "unexpected" end, the qemu backend sometimes fails to clean up and kill all qemu instances.</p>
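<p>The xz manual documents exit status 2 as "something worth a warning occurred, but no actual errors occurred". A wrapper that accepts this status could look like the following sketch. The function names <code>run_tolerating_warning</code> and <code>fake_xz_warning</code> are made up for this illustration; this is not what qemu.pm does.</p>

```shell
# Hypothetical wrapper: run a command and treat exit status 2
# (for xz: a warning, no actual error) like success.
run_tolerating_warning() {
  rc=0
  "$@" || rc=$?
  if [ "$rc" -eq 2 ]; then
    echo "warning from '$1' (exit status 2), continuing" >&2
    return 0
  fi
  return "$rc"
}

# Stand-in for an xz call that finishes with a warning.
fake_xz_warning() { return 2; }

run_tolerating_warning fake_xz_warning && echo "dump kept, job continues"
```

<p>xz also documents the option <code>--no-warn</code>, which keeps the exit status at 0 when only a warning occurred. Whether either approach fits the backend code is an open question here.</p>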
<p>The investigation documented in <a href="https://progress.opensuse.org/issues/55505" class="external">https://progress.opensuse.org/issues/55505</a> points to PR#1182 and also mentions a reproducer like:</p>
<pre><code>openqa-clone-job --from https://openqa.opensuse.org/tests/1006973 CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/poo55505_migration_incompletes WORKER_CLASS=qemu_x86_64_tw
</code></pre>
<p>Unfortunately, I haven't been able to reproduce it so far...</p>
openQA Tests - action #55427 (Rejected): [kernel][public cloud] Investigate ec2 image upload erro... | https://progress.opensuse.org/issues/55427 | 2019-08-13T09:07:23Z | cfconrad (cfamullaconrad@suse.com)
<p>The image upload is OK, but during cleanup we get the following error message:</p>
<pre><code>Created image: ami-074c17922d96d6876
An error occurred (DependencyViolation) when calling the DeleteSecurityGroup operation: resource sg-0cd4d10d680677a98 has a dependent object
</code></pre>
<p>Simply re-triggering the job works for now, as the image has already been uploaded and the next run finds it.</p>
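<p>If the dependent object is only released with some delay, one possible workaround would be to retry the deletion a few times. This is an assumption, not a confirmed fix. The helper <code>retry</code> below is a made-up sketch, and the attempt count and delay in the commented usage line are arbitrary.</p>

```shell
# Hypothetical helper: retry a command up to $1 times,
# sleeping $2 seconds between attempts.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Assumed usage with the AWS CLI; the group id is the one from the log above:
# retry 10 15 aws ec2 delete-security-group --group-id sg-0cd4d10d680677a98
```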
openQA Tests - action #54962 (Closed): [kernel][public cloud] Timeout on `img-proof --version` | https://progress.opensuse.org/issues/54962 | 2019-08-01T12:10:55Z | cfconrad (cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP5-Azure-Standard-On-Demand-x86_64-publiccloud_ipa_on_demand_azure@az_Standard_A2_v2 fails in<br>
<a href="https://openqa.suse.de/tests/3198792/modules/ipa/steps/101" class="external">ipa</a></p>
<p>The problem is that when you look at <a href="https://openqa.suse.de/tests/3198792/file/serial_terminal.txt" class="external">https://openqa.suse.de/tests/3198792/file/serial_terminal.txt</a>, you find the complete output:</p>
<pre><code># cat > /tmp/script4b0jz.sh << 'EOT_4b0jz'; echo 4b0jz-$?-
> img-proof --version
> EOT_4b0jz
4b0jz-0-
# echo 4b0jz; bash -oe pipefail /tmp/script4b0jz.sh ; echo SCRIPT_FINISHED4b0jz-$?-
4b0jz
img-proof, version 4.2.1
SCRIPT_FINISHED4b0jz-0-
</code></pre>
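<p>To check a downloaded serial log for the completion marker by hand, the exit status can be extracted from the <code>SCRIPT_FINISHED</code> line. The following is only a sketch: the helper <code>marker_exit_status</code> is made up for this illustration and is not the matching logic used by os-autoinst. The sample log is the tail of the output above.</p>

```shell
# Hypothetical helper: read a serial log on stdin and print the exit status
# from the last "SCRIPT_FINISHED<id>-<rc>-" line for the given marker id.
marker_exit_status() {
  id=$1
  sed -n "s/^SCRIPT_FINISHED${id}-\([0-9][0-9]*\)-\$/\1/p" | tail -n 1
}

# Tail of the serial output quoted above.
serial_log='4b0jz
img-proof, version 4.2.1
SCRIPT_FINISHED4b0jz-0-'

rc=$(printf '%s\n' "$serial_log" | marker_exit_status 4b0jz)
echo "marker found, exit status: ${rc:-none}"
```

<p>For the log above the marker is present with exit status 0, so the command itself had finished successfully even though the test step ran into a timeout.</p>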
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3168177" class="external">1.50</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: (unknown) (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?version=12-SP5&arch=x86_64&distri=sle&test=publiccloud_ipa_on_demand_azure&machine=az_Standard_A2_v2&flavor=Azure-Standard-On-Demand" class="external">latest</a></p>
openQA Tests - action #52919 (Resolved): [qac][public cloud] Add unit tests to PCW | https://progress.opensuse.org/issues/52919 | 2019-06-12T08:02:25Z | cfconrad (cfamullaconrad@suse.com)
<p>We need unit tests in <a href="https://github.com/cfconrad/pcw" class="external">https://github.com/cfconrad/pcw</a></p>
<p>Hints:</p>
<ul>
<li><a href="https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4" class="external">https://streaming.nue.suse.com/i/dcm/shap/2019-06-05-python-tips-tricks.mp4</a></li>
</ul>
openQA Tests - action #51101 (Resolved): [kernel][public cloud] Set IPA test result to OK | https://progress.opensuse.org/issues/51101 | 2019-05-05T20:27:13Z | cfconrad (cfamullaconrad@suse.com)
<p>Fail only if there is an unexpected error. If IPA produces a valid results.json file, we use parse_extra_log() to upload it and calculate the overall result from it.<br>
This should avoid warnings from JDP about untagged failed tests.</p>
openQA Tests - action #51098 (Closed): [kernel][wicked] Copy files to VM without network | https://progress.opensuse.org/issues/51098 | 2019-05-05T20:20:46Z | cfconrad (cfamullaconrad@suse.com)
<p>In the wicked test suite, it would be a nice feature to copy files to and from the VM without a working network.</p>
openQA Tests - action #45758 (Rejected): [qac][public cloud] Get hard boot time limits for specif... | https://progress.opensuse.org/issues/45758 | 2019-01-07T10:27:52Z | cfconrad (cfamullaconrad@suse.com)
<p>For the test modules, we should have fixed boot time limits for specific CSPs.</p>
<p>Q: The boot time depends on the VM type. Do we need to take this into account?</p>
openQA Project - action #44654 (Resolved): [tool] Summarize results in test details tab | https://progress.opensuse.org/issues/44654 | 2018-12-03T13:35:29Z | cfconrad (cfamullaconrad@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>To get a quick overview in the test details, the idea is to show grouped results in a tab on the details page, similar to what we have in the tests list: <img src="https://progress.opensuse.org/attachments/download/7238/overview.png" alt="short summary" loading="lazy" />.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> The test details tab shows a summary of the test module results</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<p>The initial idea was to have it for "external results", but it could also apply to details, which needs to be discussed. We also show the number of comments in the comment tab.</p>
openQA Tests - action #41768 (Rejected): DPDK on azure | https://progress.opensuse.org/issues/41768 | 2018-09-28T09:09:58Z | cfconrad (cfamullaconrad@suse.com)
<p><a href="https://docs.microsoft.com/en-us/azure/virtual-network/setup-dpdk" class="external">https://docs.microsoft.com/en-us/azure/virtual-network/setup-dpdk</a></p>
<p>Run performance tests and store results in dashboard.</p>
<p>DPDK is available till SLES15.</p>
openQA Project - action #40913 (Resolved): script_output sometimes fail on virtio console | https://progress.opensuse.org/issues/40913 | 2018-09-12T09:34:38Z | cfconrad (cfamullaconrad@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>It seems to happen randomly in openQA that tests fail because "cat -" never finishes.</p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<p>Observations on openQA</p>
<ul>
<li><a href="https://progress.opensuse.org/issues/30613#note-23" class="external">https://progress.opensuse.org/issues/30613#note-23</a> </li>
<li><a href="https://openqa.suse.de/tests/2031979#step/boot_ltp/78" class="external">https://openqa.suse.de/tests/2031979#step/boot_ltp/78</a></li>
</ul>
<p>I was able to bring my openQA instance into such a state. I'm not sure whether this<br>
is the same problem as the one we have on osd, but it looks similar. The big difference is that once it has happened,<br>
it always happens for that worker.</p>
<p>What I did so far:</p>
<ul>
<li>Start a test which is using virtio console</li>
<li>Restart openQA while the test is running</li>
<li>Run the tests again</li>
</ul>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>A call like this:</p>
<pre><code>cat - > /tmp/script8RI3l.sh; echo 8RI3l-$?-
</code></pre>
<p>This call never receives the EOT, so we never reach the prompt again.</p>
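<p>For reference, <code>cat -</code> only returns once its standard input reaches end-of-file. On a terminal that is signalled by the EOT character (Ctrl-D). The sketch below simulates this with a pipe, where end-of-file arrives as soon as <code>printf</code> exits. The temporary file and the script content are made up for this illustration.</p>

```shell
# Temporary file standing in for /tmp/script8RI3l.sh.
script=$(mktemp)

# The pipe closes when printf exits, so "cat -" sees end-of-file and returns.
# On a console without a delivered EOT, the same call would block forever.
printf 'echo hello from script\n' | cat - > "$script"; echo "8RI3l-$?-"

out=$(sh "$script")
echo "$out"
rm -f "$script"
```

<p>The marker line with the exit status is printed only after <code>cat</code> has returned, which is why a missing EOT leaves the test waiting.</p>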
<a name="Suggestion"></a>
<h2 >Suggestion<a href="#Suggestion" class="wiki-anchor">¶</a></h2>
<p>This needs deeper investigation.</p>
openQA Project - action #39845 (Resolved): Results of tests with very short duration (~<10s) are ... | https://progress.opensuse.org/issues/39845 | 2018-08-16T10:13:28Z | cfconrad (cfamullaconrad@suse.com)
<p>If the execution of the job takes less than approximately 10s, the results are not displayed in the openQA web UI.<br>
When the execution time is extended with "script_run('sleep 8');", the results are displayed.</p>
<p>I noticed this only with the ssh backend (<a href="https://github.com/os-autoinst/os-autoinst/pull/1012" class="external">https://github.com/os-autoinst/os-autoinst/pull/1012</a>), which is in development.</p>
<p>Failed job: <a href="http://10.86.1.52/tests/36" class="external">http://10.86.1.52/tests/36</a></p>