openSUSE Project Management Tool https://progress.opensuse.org/ 2021-09-16T04:01:18Z
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=446379 2021-09-16T04:01:18Z rfan1 (richard.fan@suse.com)
<ul></ul><p>rfan1 wrote:</p>
<blockquote>
<p>The issue is rarely seen on other platforms, where re-running the tests fixes it, but we have seen it many times on the aarch64 platform. We are not sure whether there is a performance issue with the ARM worker.</p>
<p>We usually publish a qcow2 HDD image during our tests so that later tests can use it. However, the published HDD can't be booted because it has the wrong format.</p>
<p>For example:<br>
<a href="http://openqa.nue.suse.com/tests/7103389#downloads" class="external">http://openqa.nue.suse.com/tests/7103389#downloads</a></p>
<p>In this case, the test passed without any issue, but the published qcow2 image does not seem to be bootable; <code>qemu-img info</code> reports its format as raw:<br>
qemu-img info sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2<br>
image: sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2<br>
file format: raw<br>
virtual size: 2.43 GiB (2607030272 bytes)<br>
disk size: 2.43 GiB</p>
<p>Can someone help take a look at this issue?</p>
</blockquote>
<p>BTW, this issue was fixed later: we switched to another ARM worker and the issue finally went away.</p>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=446463 2021-09-16T07:24:14Z okurz (okurz@suse.com)
<ul><li><strong>Subject</strong> changed from <i> [sle][aarch64] the published hdd can't be booted up due to wrong format</i> to <i>[tools][sle][aarch64] the published hdd can't be booted up due to wrong format</i></li><li><strong>Category</strong> set to <i>Infrastructure</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>Feedback</i></li><li><strong>Assignee</strong> set to <i>okurz</i></li><li><strong>Target version</strong> set to <i>Ready</i></li></ul><p>rfan1 wrote:</p>
<blockquote>
<p>BTW, this issue was fixed later: we switched to another ARM worker and the issue finally went away.</p>
</blockquote>
<p>Sounds like a workaround.</p>
<p>Could you please adapt the ticket description according to <a href="https://progress.opensuse.org/projects/openqav3/wiki/#Defects" class="external">https://progress.opensuse.org/projects/openqav3/wiki/#Defects</a> so that we have the necessary information to proceed? I suggest cross-checking the checksums of assets before and after to see whether something goes wrong during generation, transfer or use.</p>
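<p>The checksum cross-check suggested above could be sketched roughly as follows. This is only an illustration in Python and not part of openQA; the helper names <code>sha256_of</code> and <code>same_content</code> are hypothetical, and the choice of SHA-256 is an assumption.</p>

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that multi-GiB disk images do not have to fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have identical SHA-256 digests."""
    return sha256_of(path_a) == sha256_of(path_b)
```

<p>Comparing the digest of the image on the worker right after generation with the digest of the uploaded asset, and again with the copy a later job downloads, would show at which of the three stages the content changes.</p>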
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=446487 2021-09-16T08:19:39Z mkittler (marius.kittler@suse.com)
<ul></ul><p>I assume the <code>*.qcow2</code> file is actually supposed to be qcow2? I'm just asking because we recently introduced a change that requires the extension to match the format.</p>
<p>But otherwise it looks like the image has been somehow corrupted. We also got a warning in a related section of the autoinst log:</p>
<pre><code>[2021-09-14T12:59:21.776 CEST] [debug] running nice ionice qemu-img convert -O qcow2 /var/lib/openqa/pool/15/raid/hd0-overlay1 assets_public/sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2
[2021-09-14T12:59:40.816 CEST] [debug] running qemu-img info --output=json assets_public/sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2
Use of uninitialized value in subtraction (-) at /usr/lib/os-autoinst/backend/qemu.pm line 514.
backend::qemu::do_extract_assets(backend::qemu=HASH(0xaaaaf6581570), HASH(0xaaaaf502b528)) called at /usr/lib/os-autoinst/backend/driver.pm line 97
backend::driver::extract_assets(backend::driver=HASH(0xaaaaefe41358), HASH(0xaaaaf502b528)) called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Utils.pm line 178
eval {...} called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Utils.pm line 178
OpenQA::Isotovideo::Utils::handle_generated_assets(OpenQA::Isotovideo::CommandHandler=HASH(0xaaaaf6d80420), 1) called at /usr/bin/isotovideo line 420
[2021-09-14T12:59:40.880 CEST] [info] ::: backend::qemu::do_extract_assets: Extracting (?^u:^pflash-vars$)
[2021-09-14T12:59:40.881 CEST] [debug] running nice ionice qemu-img convert -O qcow2 /var/lib/openqa/pool/15/raid/pflash-vars-overlay1 assets_public/sle-15-SP3-aarch64-187.1-textmode@aarch64-uefi-vars_sb.qcow2
[2021-09-14T12:59:41.307 CEST] [debug] running qemu-img info --output=json assets_public/sle-15-SP3-aarch64-187.1-textmode@aarch64-uefi-vars_sb.qcow2
[2021-09-14T12:59:41.370 CEST] [debug] stopping backend process 79668
[2021-09-14T12:59:41.371 CEST] [debug] done with backend process
79273: EXIT 0
</code></pre>
<p>By the way, has this been happening more often or is this the first occurrence?</p>
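<p>The extension-versus-format check mentioned above can be illustrated with a short Python sketch. It is not the os-autoinst implementation; the function names are hypothetical. It only relies on <code>qemu-img info --output=json</code>, which appears in the log above, and on the <code>format</code> field of its JSON output.</p>

```python
import json
import subprocess
from pathlib import Path


def detected_format(info_json):
    """Extract the "format" field from `qemu-img info --output=json` text."""
    return json.loads(info_json).get("format")


def extension_matches_format(image_path, info_json):
    """Return True if the file extension (without the dot) equals the
    format that qemu-img detected for the image."""
    return Path(image_path).suffix.lstrip(".") == detected_format(info_json)


def qemu_img_info(image_path):
    """Run qemu-img on the image and return its JSON output as text.
    Requires qemu-img to be installed; raises if the command fails."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", image_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

<p>For the image from this ticket, <code>extension_matches_format(path, qemu_img_info(path))</code> would be expected to return false, because the file is named <code>*.qcow2</code> while the detected format is raw.</p>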
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=446535 2021-09-16T09:06:09Z rfan1 (richard.fan@suse.com)
<ul></ul>
<p>Thanks all for the kind help on this case! We have hit this issue many times before, especially when running an openQA job from our own branch (a very strange result; is there any restriction on using our own branches?).</p>
<p>On the x86 platform, re-running the job usually fixes the issue. On ARM platforms, however, we hit it more than five times even though we re-ran the tests again and again. Finally, we switched to another worker and the issue was gone.</p>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=446865 2021-09-17T02:17:57Z rfan1 (richard.fan@suse.com)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/446865/diff?detail_id=423717">diff</a>)</li></ul>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=448818 2021-09-23T17:14:18Z okurz (okurz@suse.com)
<ul></ul><p>Hm, I suspect that our workers openqaworker-arm-[123] can be trusted even less than we think.</p>
<p>Note to team: I suggest cross-checking the checksums of assets before and after to see whether something goes wrong during generation, transfer or use.</p>
<p>@rfan I think you could help by updating the ticket with a regex matching the error condition so that it can be used with <a href="https://github.com/os-autoinst/scripts/blob/master/README.md#auto-review---automatically-detect-known-issues-in-openqa-jobs-label-openqa-jobs-with-ticket-references-and-optionally-retrigger" class="external">https://github.com/os-autoinst/scripts/blob/master/README.md#auto-review---automatically-detect-known-issues-in-openqa-jobs-label-openqa-jobs-with-ticket-references-and-optionally-retrigger</a>. Also, if you like, you can generate an image and use it in later jobs by triggering openQA jobs with <code>WORKER_CLASS=openqaworker-arm-4</code>, which pins them to one of our newer ARM machines, to see whether the problem appears there as well.</p>
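<p>Before adding a regex to the ticket, it can be tried out against a known log line. The Python sketch below uses a hypothetical candidate pattern derived from the warning in the autoinst log quoted earlier in this ticket. Whether that warning really identifies the broken image has not been confirmed here, so treat both the pattern and the function name as assumptions.</p>

```python
import re

# Hypothetical candidate pattern: matches the Perl warning from the autoinst
# log regardless of the installation prefix and the exact line number.
CANDIDATE = re.compile(
    r"Use of uninitialized value in subtraction \(-\) at \S*backend/qemu\.pm line \d+"
)


def matches_known_issue(log_text):
    """Return True if the candidate pattern occurs anywhere in the log text."""
    return CANDIDATE.search(log_text) is not None
```

<p>Running <code>matches_known_issue</code> over the autoinst logs of a few passing and a few affected jobs would show whether the pattern is specific enough to be useful.</p>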
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=448875 2021-09-24T01:50:21Z rfan1 (richard.fan@suse.com)
<ul></ul>
<p>Thanks Oliver! Will do!</p>
<p><a href="https://openqa.suse.de/tests/7214160" class="external">https://openqa.suse.de/tests/7214160</a></p>
<p># qemu-img info sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2<br>
image: sle-15-SP3-aarch64-187.1-textmode@aarch64_sb.qcow2<br>
file format: qcow2<br>
virtual size: 20 GiB (21474836480 bytes)<br>
disk size: 3.17 GiB<br>
cluster_size: 65536<br>
Format specific information:<br>
compat: 1.1<br>
compression type: zlib<br>
lazy refcounts: false<br>
refcount bits: 16<br>
corrupt: false<br>
extended l2: false</p>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=454979 2021-10-13T09:07:46Z okurz (okurz@suse.com)
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Low</i></li></ul><p>As no further impact has been reported by others so far, I regard this as low priority. <a class="user active user-mention" href="https://progress.opensuse.org/users/34730">@rfan1</a> any update from your side?</p>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=454997 2021-10-13T09:47:13Z rfan1 (richard.fan@suse.com)
<ul></ul><p>Thanks Oliver!<br>
I agree, since the issue is no longer seen with the higher-performance worker.</p>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=455627 2021-10-15T03:40:42Z rfan1 (richard.fan@suse.com)
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" href="/issues/101015">action #101015</a>: [tools][sle][x86_64][aarch64][QEMUTPM] can openqa create swtpm device automatically? size:M</i> added</li></ul>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=455633 2021-10-15T03:42:28Z rfan1 (richard.fan@suse.com)
<ul><li><strong>Copied to</strong> deleted (<i><a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" href="/issues/101015">action #101015</a>: [tools][sle][x86_64][aarch64][QEMUTPM] can openqa create swtpm device automatically? size:M</i>)</li></ul>
openQA Project - action #98727: [tools][sle][aarch64] the published hdd can't be booted up due to wrong format https://progress.opensuse.org/issues/98727?journal_id=464945 2021-11-16T10:42:52Z okurz (okurz@suse.com)
<ul><li><strong>Project</strong> changed from <i>openQA Tests</i> to <i>openQA Project</i></li><li><strong>Category</strong> changed from <i>Infrastructure</i> to <i>Support</i></li><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>Alright. I still consider it worthwhile to ensure that assets are correctly generated/uploaded/downloaded with checksums. So this can be a potential future improvement. Added in <a class="issue tracker-6 status-1 priority-3 priority-lowest" title="coordination: [epic] Various feature requests (New)" href="https://progress.opensuse.org/issues/65271">#65271</a></p>