openSUSE Project Management Tool: Issues (https://progress.opensuse.org/, updated 2023-07-21T08:58:16Z)
openQA Infrastructure - action #133154 (Resolved): osd-deployment failed because of unreachable workers (https://progress.opensuse.org/issues/133154, 2023-07-21T08:58:16Z, osukup)
<p><a href="https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/736743" class="external">https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/736743</a></p>
<p>from logs:</p>
<pre><code>sapworker1.qe.nue2.suse.org:
Minion did not return. [Not connected]
openqaworker1.qe.nue2.suse.org:
Minion did not return. [Not connected]
sapworker2.qe.nue2.suse.org:
Minion did not return. [Not connected]
sapworker3.qe.nue2.suse.org:
Minion did not return. [Not connected]
+++ kill %1
</code></pre>
<p>Tried to ping/ssh the hosts and none of them is reachable.<br>
IPMI also gives no response, and these hosts have corresponding host-up alerts in Grafana.</p>
openQA Infrastructure - action #132926 (Workable): OSD cron -> (fetch_openqa_bugs)> /tmp/fetch_op... (https://progress.opensuse.org/issues/132926, 2023-07-18T07:56:34Z, osukup)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>OSD cron -> (fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log failed:</p>
<p>from traceback:</p>
<pre><code>requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /repos/SUSE/ha-sap-terraform-deployments/issues/857 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f7439e43b38>, 'Connection to api.github.com timed out. (connect timeout=10)'))
</code></pre>
<p>fetch_openqa_bugs failed while fetching issues from GitHub.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> It is understood why the error occurred</li>
<li><strong>AC2:</strong> The error does not persist</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Make sure you can log in, see <a href="https://gitlab.suse.de/OPS-Service/salt/-/blob/production/pillar/id/openqa-service_qe_suse_de.sls#L11" class="external">https://gitlab.suse.de/OPS-Service/salt/-/blob/production/pillar/id/openqa-service_qe_suse_de.sls#L11</a> or ask dheidler/mkittler to do that for you</li>
<li>Assuming "host unavailable", check how long the script retried
<ul>
<li>Re-try more often?</li>
<li>Wait longer between attempts?</li>
</ul></li>
<li><a href="https://github.com/os-autoinst/openqa_bugfetcher" class="external">https://github.com/os-autoinst/openqa_bugfetcher</a></li>
</ul>
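<p>The retry suggestions above could take roughly this shape: a generic helper with exponential backoff around the GitHub fetch. This is a sketch with illustrative names, not the bugfetcher's actual code; the demo uses a stub instead of a real HTTP request (the real script would catch <code>requests.exceptions.ConnectTimeout</code> rather than the bare built-ins).</p>

```python
import time

def fetch_with_retries(fetch, attempts=5, initial_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call fetch(), retrying on connection/timeout errors with exponential backoff."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise  # out of retries, propagate the original error
            sleep(delay)
            delay *= backoff  # wait longer between attempts

# Demo: a fake fetcher that times out twice before succeeding
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("Connection to api.github.com timed out")
    return {"number": 857, "state": "open"}

result = fetch_with_retries(flaky_fetch, sleep=lambda _: None)  # skip real sleeping in the demo
```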
openQA Infrastructure - action #125228 (Rejected): Salt pillars deployment failed on storage.oqa.... (https://progress.opensuse.org/issues/125228, 2023-03-01T12:27:23Z, osukup)
<pre><code> ID: /root/.ssh/id_ed25519.backup_osd
Function: file.managed
Result: False
Comment: Pillar id_ed25519.backup_osd does not exist
Started: 13:09:31.581660
Duration: 2.844 ms
Changes:
</code></pre>
QA - action #111506 (Resolved): qa-tools: qem-bot - Development results leaked to dashboard size:M (https://progress.opensuse.org/issues/111506, 2022-05-24T08:17:00Z, osukup)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Results of the development group 'pstivanin-security@Server-DVD-Updates' were synced to the qem dashboard.</p>
<p>We already filter out development jobs in <a href="https://github.com/openSUSE/qem-bot/blob/master/openqabot/aggrsync.py#L53-L68" class="external">https://github.com/openSUSE/qem-bot/blob/master/openqabot/aggrsync.py#L53-L68</a> in case of aggregates and in <a href="https://github.com/openSUSE/qem-bot/blob/master/openqabot/incsyncres.py#L48-L61" class="external">https://github.com/openSUSE/qem-bot/blob/master/openqabot/incsyncres.py#L48-L61</a> in case of incidents, but today <a href="https://openqa.suse.de/tests/overview?version=15-SP3&groupid=431&flavor=Server-DVD-Updates&distri=sle&build=20220523-1" class="external">pstivanin-security</a> leaked to the qem-dashboard in <a href="http://dashboard.qam.suse.de/incident/24199" class="external">http://dashboard.qam.suse.de/incident/24199</a>, although it shouldn't be shown there.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Development jobs should not be part of incident approval</li>
<li><strong>AC2:</strong> Dashboard no longer shows development jobs</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Don't let the bot submit development jobs to the dashboard</li>
<li>Prevent duplication in <a href="https://github.com/openSUSE/qem-bot/blob/master/openqabot/aggrsync.py#L53-L68" class="external">https://github.com/openSUSE/qem-bot/blob/master/openqabot/aggrsync.py#L53-L68</a> and <a href="https://github.com/openSUSE/qem-bot/blob/master/openqabot/incsyncres.py#L48-L61" class="external">https://github.com/openSUSE/qem-bot/blob/master/openqabot/incsyncres.py#L48-L61</a></li>
<li>Also exclude jobs that are part of any parent job group regardless of the job group names</li>
</ul>
<a name="Further-notes"></a>
<h2 >Further notes<a href="#Further-notes" class="wiki-anchor">¶</a></h2>
<ul>
<li>A "development job" is a job in a job group or in a parent job group where the name contains "Development".</li>
</ul>
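<p>Following the definition above, a minimal sketch of such a filter (plain dicts stand in for the real qem-bot job objects; the field names are assumptions):</p>

```python
def is_development_job(job):
    """True if the job group or its parent job group name contains "Development"."""
    names = (job.get("group"), job.get("parent_group"))
    return any("Development" in (name or "") for name in names)

jobs = [
    {"id": 1, "group": "Maintenance: SLE 15 SP3 Updates", "parent_group": "Maintenance"},
    {"id": 2, "group": "pstivanin-security", "parent_group": "Development / Security"},
]
synced = [j for j in jobs if not is_development_job(j)]  # only job 1 reaches the dashboard
```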
QA - action #109977 (Resolved): qem-bot - approve pipeline failed with 403 forbidden size:M (https://progress.opensuse.org/issues/109977, 2022-04-14T10:45:36Z, osukup)
<p>Gitlab CI <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/929031">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/929031</a></p>
<p>After looking into IBS: the RR (release request) had already been accepted, but that information hadn't propagated to the dashboard yet (some latency is expected in the IBS -> SMELT -> Dashboard chain), so a new attempt to accept the already accepted RR ends with an error:</p>
<pre><code>INFO: Accepting review for SUSE:Maintenace:23688:269540
ERROR: HTTP Error 403: Forbidden
Traceback (most recent call last):
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/approver.py", line 117, in osc_approve
message=msg,
File "/usr/lib/python3.6/site-packages/osc/core.py", line 4323, in change_review_state
f = http_POST(u, data=message)
File "/usr/lib/python3.6/site-packages/osc/core.py", line 3422, in http_POST
def http_POST(*args, **kwargs): return http_request('POST', *args, **kwargs)
File "/usr/lib/python3.6/site-packages/osc/core.py", line 3410, in http_request
fd = urlopen(req, data=data)
File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib64/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python3.6/urllib/request.py", line 564, in error
result = self._call_chain(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 1059, in http_error_401
url, req, headers)
File "/usr/lib64/python3.6/urllib/request.py", line 1007, in http_error_auth_reqed
return self.retry_http_basic_auth(host, req, realm)
File "/usr/lib64/python3.6/urllib/request.py", line 1022, in retry_http_basic_auth
return self.parent.open(req, timeout=req.timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib64/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Approve runs without error in case RR is already processed</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Check response before throwing exception and act accordingly</li>
<li>Check in IBS whether the RR really has an open review for <code>qam-openqa</code></li>
<li>The 403 response code here means we are trying to accept something that is already accepted. Consider catching the case where the XML indicates this specific error, then logging and ignoring it.</li>
<li>Log server response (not just return code)</li>
<li>Suggest a better response (and response code) to OBS upstream</li>
</ul>
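<p>A hedged sketch of the suggestion to check the response before throwing: catch the <code>HTTPError</code>, log the full server response body, and treat an "already accepted" 403 as success. The marker string and function names are assumptions, not osc's or qem-bot's actual code; the demo fakes the POST call.</p>

```python
import io
from urllib.error import HTTPError

def accept_review(post, request_id):
    """Try to accept a review; tolerate a 403 whose body says the RR was already processed."""
    try:
        return post(request_id)
    except HTTPError as e:
        body = e.read().decode(errors="replace") if e.fp else ""
        if e.code == 403 and "accepted" in body:  # hypothetical marker in the response body
            # log the full server response, not just the return code
            print(f"RR {request_id} already processed, ignoring: {body.strip()}")
            return None
        raise

# Demo: a fake http_POST that answers 403 with an explanatory body
def fake_post(request_id):
    raise HTTPError("https://api.suse.de", 403, "Forbidden", {},
                    io.BytesIO(b"request is already accepted"))

result = accept_review(fake_post, 269540)
```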
QA - action #109512 (Resolved): qem-bot - add vars with GitlabCI job link and qem-dashboard link (https://progress.opensuse.org/issues/109512, 2022-04-05T18:11:21Z, osukup)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Make it easy to find which GitLab CI job scheduled an incident/aggregate in openQA.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Variables with clickable URLs to the scheduling GitLab CI job and the QEM dashboard are added to the openQA job variables</li>
</ul>
QA - action #109509 (New): qem-dashboard - show better info about time of actualization of data (https://progress.opensuse.org/issues/109509, 2022-04-05T18:07:39Z, osukup)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>We already have a "last updated" timestamp on the dashboard, which can be misleading.</p>
<p>We should instead show the last time of the SMELT sync, the last time of the incidents schedule, and the last time of the aggregates schedule.</p>
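<p>One possible payload shape for this, as a sketch (entirely hypothetical; the real API is up to qem-dashboard):</p>

```python
from datetime import datetime, timezone

def sync_status(smelt_sync, incidents_schedule, aggregates_schedule):
    """Report one timestamp per data source instead of a single "last updated" value."""
    def iso(dt):
        return dt.astimezone(timezone.utc).isoformat(timespec="seconds")
    return {
        "last_smelt_sync": iso(smelt_sync),
        "last_incidents_schedule": iso(incidents_schedule),
        "last_aggregates_schedule": iso(aggregates_schedule),
    }

status = sync_status(
    datetime(2022, 4, 5, 17, 0, tzinfo=timezone.utc),
    datetime(2022, 4, 5, 16, 30, tzinfo=timezone.utc),
    datetime(2022, 4, 5, 12, 0, tzinfo=timezone.utc),
)
```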
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> The correct update datetime is shown on the dashboard</li>
</ul>
QA - action #109488 (Resolved): qem-bot - better logging (https://progress.opensuse.org/issues/109488, 2022-04-05T13:21:31Z, osukup)
<p>As a follow-up to the 5 Whys analysis:</p>
<ul>
<li>Many log lines with "ERROR" about a missing repomd.xml -> turn them into INFO</li>
<li>Log message "WARNING: Missing product in /etc/openqabot" -> "DEBUG: Skipping obsolete openQABot config /etc/openqabot/bot.yml"</li>
<li>Log message "DEBUG: Incident … does not have x86_64 arch in 12-SP3" -> so what? -> maybe we can simply remove that message and ignore it, or move it to TRACE</li>
<li>Log message "DEBUG: No channels in … for …" -> Can we give readers some hints about what it means or what they could check, e.g. is there no valid smelt_channel to openQA product mapping in the metadata? Maybe the incident is obsolete and should be closed, removed, etc.?</li>
<li>Log message "NOT SCHEDULE:" -> lowercase and use "not scheduling"</li>
<li>Log message "Project ... can't calculate repohash" -> it would be useful to have a timestamp of the last update from OBS</li>
<li>Log message for aggregates "Posting ... jobs" is ambiguous or wrong; it should be more like "Triggering ... openQA products" or "openqa isos post calls"</li>
<li>We found a problem where an exception occurs because the openQA API returns 404 on "post isos" when a product is missing in openQA. This error is ignored and the job continues. We should handle that better.</li>
<li>Add a concluding log message after triggering tests, like "Triggering done ... jobs"</li>
</ul>
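<p>The first point, demoting known-noisy ERROR lines to INFO, could be done centrally with a logging filter. A sketch with Python's standard logging module, not qem-bot's actual setup; the noisy substrings and the demo message are assumptions:</p>

```python
import logging

class DemoteNoise(logging.Filter):
    """Demote known-noisy ERROR records (e.g. missing repomd.xml) to INFO before emit."""
    NOISY = ("repomd.xml",)  # substrings considered noise; extend as needed

    def filter(self, record):
        if record.levelno == logging.ERROR and any(s in record.getMessage() for s in self.NOISY):
            record.levelno = logging.INFO
            record.levelname = "INFO"
        return True  # never drop the record, only relabel it

logger = logging.getLogger("bot")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
handler.addFilter(DemoteNoise())
logger.addHandler(handler)
logger.error("no repomd.xml found for SUSE:Maintenance:12345")  # demo message, relabeled to INFO
```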
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1</strong>: better logs from qem-bot</li>
</ul>
openQA Project - action #109310 (Resolved): qem-bot/dashboard - mixed old and new incidents size:M (https://progress.opensuse.org/issues/109310, 2022-03-31T11:13:01Z, osukup)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Maintenance sometimes re-uses old incidents instead of creating new ones for a package, which leads to mixed results in the dashboard :(</p>
<p>see: <a href="https://suse.slack.com/archives/C02D16TCP99/p1648721562205869">https://suse.slack.com/archives/C02D16TCP99/p1648721562205869</a></p>
<p>So we need a workaround/solution for this corner case.</p>
<p>See also <a href="https://github.com/openSUSE/qem-dashboard/issues/61">https://github.com/openSUSE/qem-dashboard/issues/61</a></p>
<p>Originally brought up by coolo in<br>
<a href="https://suse.slack.com/archives/C02D16TCP99/p1638283633141300">https://suse.slack.com/archives/C02D16TCP99/p1638283633141300</a> </p>
<blockquote>
<p>I just noticed a rather alarming issue: <a href="http://dashboard.qam.suse.de/incident/20989">http://dashboard.qam.suse.de/incident/20989</a> talks about 43 passed, 1 failed jobs for the incident</p>
</blockquote>
<a name="Problems"></a>
<h2 >Problems<a href="#Problems" class="wiki-anchor">¶</a></h2>
<ul>
<li><a href="http://dashboard.qam.suse.de/incident/20639">http://dashboard.qam.suse.de/incident/20639</a> references "208 passed, 4 failed, 12 stopped" and a link to openQA results <a href="https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc">https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc</a> but the openQA test results only show 183 passed and 18 soft-failed
<ul>
<li>-> dashboard should not say "passed" when it means "passed+softfailed" but "ok", see <a href="https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Jobs/Constants.pm#L76=">https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Jobs/Constants.pm#L76=</a></li>
<li>-> Consider using time-fixed links, e.g. <a href="https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc&t=2022-04-01+08%3A53%3A19+%2B0000">https://openqa.suse.de/tests/overview?build=%3A20639%3Aopensc&t=2022-04-01+08%3A53%3A19+%2B0000</a></li>
<li>-> Ensure that the results are current and correspond to what openQA sees itself (numbers should match)</li>
<li>-> Exclude any results that are outside a "reasonable time range", e.g. <a href="http://dashboard.qam.suse.de/blocked">http://dashboard.qam.suse.de/blocked</a> for 20639 shows incident results from some months ago, build 2021…</li>
</ul></li>
</ul>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> It is possible to reuse incidents and qem-bot can still approve related release requests</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read the qem-dashboard schema to understand where important settings are stored in <a href="https://github.com/openSUSE/qem-dashboard/">https://github.com/openSUSE/qem-dashboard/</a> , in particular <a href="https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql">https://github.com/openSUSE/qem-dashboard/blob/main/migrations/dashboard.sql</a></li>
<li>Read the proper manual process under "Workarounds" (further down), both as a workaround and for our own understanding</li>
<li>Just delete all aggregate openQA data in qem-dashboard older than a configurable age, defaulting to 90 days</li>
</ul>
<a name="Workarounds"></a>
<h2 >Workarounds<a href="#Workarounds" class="wiki-anchor">¶</a></h2>
<ul>
<li>Ask maintenance to create a new, fresh incident, e.g. by a comment in IBS</li>
<li>Detect invalid requests, e.g. with outdated results, and reject them</li>
<li>Manually delete</li>
</ul>
<p>Something along the lines of</p>
<pre><code>ssh root@qam2.suse.de
machinectl shell postgresql
sudo -u postgres psql dashboard_db
(wreak havoc in here)
SELECT update_settings FROM openqa_jobs WHERE update_settings is not NULL AND updated < NOW() - INTERVAL X
(store update_settings)
DELETE FROM openqa_jobs WHERE update_settings is not NULL AND updated < NOW() - INTERVAL X
DELETE FROM update_openqa_settings WHERE id in `stored update_settings`
</code></pre>
openQA Infrastructure - action #109301 (Rejected): openqaworker14 + openqaworker15 sporadically g... (https://progress.opensuse.org/issues/109301, 2022-03-31T09:07:53Z, osukup)
<a name="OBSERVATION"></a>
<h2 >OBSERVATION<a href="#OBSERVATION" class="wiki-anchor">¶</a></h2>
<p>On reboot, these workers sporadically fail to boot correctly, ending up in emergency mode:</p>
<pre><code>bře 08 14:34:24 openqaworker14 kernel: Loading iSCSI transport class v2.0-870.
bře 08 14:34:24 openqaworker14 systemd[1]: Finished Create Volatile Files and Directories.
bře 08 14:34:24 openqaworker14 systemd[1]: Starting Security Auditing Service...
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: nvme0n1 259:0 0 3.5T 0 disk
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: ├─nvme0n1p1 259:1 0 512M 0 part
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: ├─nvme0n1p2 259:2 0 1T 0 part /
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: └─nvme0n1p3 259:3 0 2.5T 0 part
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1557]: └─md127 9:127 0 2.5T 0 raid0
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1552]: Stopping current RAID "/dev/md/openqa"
bře 08 14:34:24 openqaworker14 systemd[1]: Finished Flush Journal to Persistent Storage.
bře 08 14:34:24 openqaworker14 kernel: i40iw_open: i40iw_open completed
bře 08 14:34:24 openqaworker14 systemd[1]: Created slice Slice /system/rdma-load-modules.
bře 08 14:34:24 openqaworker14 systemd[1]: Starting Load RDMA modules from /etc/rdma/modules/iwarp.conf...
bře 08 14:34:24 openqaworker14 systemd[1]: Starting Load RDMA modules from /etc/rdma/modules/rdma.conf...
bře 08 14:34:24 openqaworker14 kernel: ixgbe 0000:d8:00.1: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
bře 08 14:34:24 openqaworker14 systemd[1]: Finished Load RDMA modules from /etc/rdma/modules/iwarp.conf.
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1559]: mdadm: stopped /dev/md/openqa
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1552]: Creating RAID0 "/dev/md/openqa" on: /dev/nvme0n1p3
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1574]: mdadm: /dev/nvme0n1p3 appears to be part of a raid array:
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1574]: level=raid0 devices=1 ctime=Mon Mar 7 10:20:52 2022
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1574]: mdadm: unexpected failure opening /dev/md127
bře 08 14:34:24 openqaworker14 openqa-establish-nvme-setup[1552]: Unable to create RAID, mdadm returned with non-zero code
bře 08 14:34:24 openqaworker14 kernel: i40iw_open: i40iw_open completed
bře 08 14:34:24 openqaworker14 systemd[1]: openqa_nvme_format.service: Main process exited, code=exited, status=1/FAILURE
bře 08 14:34:24 openqaworker14 systemd[1]: openqa_nvme_format.service: Failed with result 'exit-code'.
bře 08 14:34:24 openqaworker14 systemd[1]: Failed to start Setup NVMe before mounting it.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for /var/lib/openqa.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for openQA Worker #1.
bře 08 14:34:24 openqaworker14 systemd[1]: openqa-worker-auto-restart@1.service: Job openqa-worker-auto-restart@1.service/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for var-lib-openqa-share.automount.
bře 08 14:34:24 openqaworker14 systemd[1]: var-lib-openqa-share.automount: Job var-lib-openqa-share.automount/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for openQA Worker #3.
bře 08 14:34:24 openqaworker14 systemd[1]: openqa-worker-auto-restart@3.service: Job openqa-worker-auto-restart@3.service/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for Prepare NVMe after mounting it.
bře 08 14:34:24 openqaworker14 systemd[1]: openqa_nvme_prepare.service: Job openqa_nvme_prepare.service/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for Local File Systems.
bře 08 14:34:24 openqaworker14 systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for openQA Worker #2.
bře 08 14:34:24 openqaworker14 systemd[1]: openqa-worker-auto-restart@2.service: Job openqa-worker-auto-restart@2.service/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: Dependency failed for openQA Worker #4.
bře 08 14:34:24 openqaworker14 systemd[1]: openqa-worker-auto-restart@4.service: Job openqa-worker-auto-restart@4.service/start failed with result 'dependency'.
bře 08 14:34:24 openqaworker14 systemd[1]: var-lib-openqa.mount: Job var-lib-openqa.mount/start failed with result 'dependency'.
</code></pre>
<p>The cause of the problem is probably a difference in the hardware configuration of these workers. Our standard workers have 1x HDD with the OS and 1x NVMe SSD with /dev/md/openqa. These workers have only one NVMe SSD,<br>
configured as:</p>
<pre><code>nvme0n1
├─nvme0n1p1 vfat FAT32 9AED-277B 506M 1% /boot/efi
├─nvme0n1p2 btrfs 5a405f4e-bd0c-46cb-a5ee-a0e976968be1 1016,5G 1% /
└─nvme0n1p3 linux_raid_member 1.2 openqaworker14:openqa 03972fdb-874d-cbec-4cb8-bca5412d90a2
└─md127 ext2 1.0 4c30279b-d757-4a97-b636-539b18bc9e22 2,3T 0% /var/lib/openqa
</code></pre>
openQA Project - action #101779 (Resolved): osd deployment failed with non-zero status openqa-wor... (https://progress.opensuse.org/issues/101779, 2021-11-01T08:52:29Z, osukup)
<p>zypper reports that a reboot is needed and thus exits with a non-zero status (the 10x codes are informational and can safely be considered as zero)</p>
<p><a href="https://gitlab.suse.de/openqa/osd-deployment/-/jobs/669420" class="external">https://gitlab.suse.de/openqa/osd-deployment/-/jobs/669420</a></p>
<p><a href="https://gitlab.suse.de/openqa/osd-deployment/-/blob/master/.gitlab-ci.yml#L188" class="external">https://gitlab.suse.de/openqa/osd-deployment/-/blob/master/.gitlab-ci.yml#L188</a></p>
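<p>A minimal sketch of treating zypper's informational exit codes as success in the deployment pipeline. The helper name is made up; the 10x code meanings are from the zypper man page (e.g. 100 = updates available, 102 = reboot needed):</p>

```python
# zypper exit codes in the 100..106 range are informational, not errors
INFORMATIONAL = range(100, 107)

def zypper_ok(returncode):
    """Treat informational zypper exit codes as zero so the CI job doesn't fail."""
    return returncode == 0 or returncode in INFORMATIONAL
```

In the <code>.gitlab-ci.yml</code> this corresponds to accepting those codes after the zypper call instead of failing the job.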
QA - action #96968 (Resolved): qem-dashboard jobs api extension (https://progress.opensuse.org/issues/96968, 2021-08-16T10:45:29Z, osukup)
<p><a href="https://gitlab.suse.de/opensuse/qem-dashboard/-/issues/14" class="external">https://gitlab.suse.de/opensuse/qem-dashboard/-/issues/14</a></p>
<p>For the approve part of the bot, we need access to the list of jobs for exact settings (incident_settings/updates_settings).</p>
<p>The proposed API is described in the GitLab issue.</p>
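<p>The client side might then look roughly like this. The endpoint path is purely hypothetical, the real one is defined in the GitLab issue above; a stub stands in for the HTTP client:</p>

```python
def jobs_for_settings(get, kind, settings_id):
    """Fetch the list of jobs for an exact incident_settings/updates_settings id."""
    assert kind in ("incident_settings", "updates_settings")
    return get(f"/api/jobs/{kind}/{settings_id}")  # hypothetical endpoint

# Demo with a stubbed HTTP getter instead of a real dashboard
fake_api = {"/api/jobs/incident_settings/123": [{"job_id": 42, "status": "passed"}]}
jobs = jobs_for_settings(fake_api.get, "incident_settings", 123)
```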
openQA Infrastructure - action #96719 (Resolved): recover imagetester with broken filesystem/hard... (https://progress.opensuse.org/issues/96719, 2021-08-10T14:36:00Z, osukup)
<p>During work on <a href="https://progress.opensuse.org/issues/96311" class="external">https://progress.opensuse.org/issues/96311</a>, we found that imagetester hadn't been updated for 2 months.</p>
<p>Investigate why the automatic transactional update wasn't working and update imagetester.</p>
<p>Now blocked by <a href="https://infra.nue.suse.com/SelfService/Display.html?id=194271" class="external">https://infra.nue.suse.com/SelfService/Display.html?id=194271</a>, because the host didn't survive a reboot and it doesn't have any remote management interface.</p>
openQA Tests - action #17082 (Resolved): test fails in yast2_snapper (https://progress.opensuse.org/issues/17082, 2017-02-15T12:08:10Z, osukup)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-12-SP2-Desktop-DVD-Updates-x86_64-qam-allpatterns+sdk@64bit fails in<br>
<a href="https://openqa.suse.de/tests/777201/modules/yast2_snapper/steps/48" class="external">yast2_snapper</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/777201" class="external">20170215-2</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/776821" class="external">20170215-1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?machine=64bit&version=12-SP2&arch=x86_64&distri=sle&flavor=Desktop-DVD-Updates&test=qam-allpatterns%2Bsdk" class="external">latest</a></p>
<p>Comments in snapshots are now more informative and part of the snapper window stays hidden --> send super_up and check in a maximized window?</p>
openQA Project - action #12924 (Resolved): Successful test with STORE_HDD_1 set as failed (https://progress.opensuse.org/issues/12924, 2016-07-28T16:27:23Z, osukup)
<p>see <a href="http://argus.suse.cz/tests/223">http://argus.suse.cz/tests/223</a></p>
<p>in autoinst-log: </p>
<p><code>16:00:29.9383 2425 QEMU: qemu: terminating on signal 15 from pid 2425<br>
16:00:29.9386 2425 sending magic and exit<br>
16:00:29.9387 2413 received magic close<br>
16:00:29.9618 2413 got sigchld<br>
16:00:29.9625 2413 preparing hdd 1 for upload as SLES-12-SP1-x86_64-kernel-BUILD-Pepper1.qcow2 in qcow2<br>
16:01:12.3712 2413 got sigchld<br>
16:01:12.3714 2413 sha1sum f56dd16162e3055c1af33d32e4513cc9273f896c *assets_public/SLES-12-SP1-x86_64-kernel-BUILD-Pepper1.qcow2<br>
EXIT 0<br>
16:01:12.3717 2413 awaiting death of commands process<br>
16:01:12.3716 2416 sysread failed: Connection reset by peer<br>
16:01:12.3835 2413 commands process exited: 2416</code></p>
<p>from journal :</p>
<p><code>čec 28 18:02:49 argus worker[1799]: Received command livelog_stop for job 223, but we do not have any assigned. Ignoring!<br>
čec 28 18:02:15 argus worker[1799]: setting job 223 to done<br>
čec 28 18:02:05 argus worker[1799]: cleaning up 00000223-qam-12-SP1-Server-DVD-Kernel-x86_64-BuildPepper1-sles-kernel-create-hdd@64bit-kernel...<br>
čec 28 18:02:05 argus worker[1799]: 0: 500 response: Internal Server Error at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 247.<br>
čec 28 18:02:05 argus openqa[1472]: [Thu Jul 28 18:02:05 2016] [1474:error] DBIx::Class::Row::get_column(): No such column 'child.result' on OpenQA::Schema::Result::JobDependencies at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1284<br>
čec 28 18:02:00 argus worker[1799]: 1: 500 response: Internal Server Error at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 247.<br>
čec 28 18:02:00 argus openqa[1472]: [Thu Jul 28 18:02:00 2016] [1477:error] DBIx::Class::Row::get_column(): No such column 'child.result' on OpenQA::Schema::Result::JobDependencies at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1284<br>
čec 28 18:01:55 argus worker[1799]: 2: 500 response: Internal Server Error at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/UserAgent.pm line 247.<br>
čec 28 18:01:55 argus openqa[1472]: [Thu Jul 28 18:01:55 2016] [1479:error] DBIx::Class::Row::get_column(): No such column 'child.result' on OpenQA::Schema::Result::JobDependencies at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1284<br>
čec 28 18:01:14 argus worker[1799]: waitpid returned error: No child processes<br>
čec 28 18:01:14 argus worker[1799]: killing 2413</code></p>
<p>The sha1sum of the created HDD image is correct.</p>