openSUSE Project Management Tool: Issues https://progress.opensuse.org/ https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?1582917784 2024-03-24T18:55:53Z openSUSE Project Management Tool
Redmine QA - action #157819 (Resolved): Can't login to walter1 and walter2 offline https://progress.opensuse.org/issues/157819 2024-03-24T18:55:53Z okurz okurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<pre><code>$ ssh root@walter1.qe.nue2.suse.org
Password:
</code></pre><pre><code>$ ping walter2.qe.nue2.suse.org
PING walter2.qe.nue2.suse.org(2a07:de40:a102:5:10:168:192:2 (2a07:de40:a102:5:10:168:192:2)) 56 data bytes
From 2a07:de40:a100:1:ffff:ffff:ffff:ffff (2a07:de40:a100:1:ffff:ffff:ffff:ffff) icmp_seq=3 Destination unreachable: Address unreachable
$ ping -4 root@walter2.qe.nue2.suse.org
ping: root@walter2.qe.nue2.suse.org: Name or service not known
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> SSH login for QE tools members works for both walter1+2</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Try to reproduce yourself</li>
<li>Create SUSE IT ticket <a href="https://progress.opensuse.org/projects/qa/wiki/Tools#SUSE-IT-ticket-handling" class="external">https://progress.opensuse.org/projects/qa/wiki/Tools#SUSE-IT-ticket-handling</a></li>
</ul>
openQA Infrastructure - action #156514 (Resolved): Cron <root@openqa-service> (date; fetch_openqa... https://progress.opensuse.org/issues/156514 2024-03-04T07:37:00Z livdywan liv.dywan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Cron <a href="mailto:root@openqa-service">root@openqa-service</a> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log at Mon, 4 Mar 2024 00:20:40 +0000 (UTC):</p>
<pre><code>Exception occured while fetching bsc#1158056
Traceback (most recent call last):
File "/usr/bin/fetch_openqa_bugs", line 62, in <module>
raise e
File "/usr/bin/fetch_openqa_bugs", line 48, in <module>
issue = issue_fetcher.get_issue(bugid)
File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/__init__.py", line 88, in get_issue
return self.prefix_table[prefix](self.conf, bugid)
File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/__init__.py", line 24, in __init__
self.fetch(conf)
File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/bugzilla_issue.py", line 28, in fetch
data = req.json()
File "/usr/lib/python3.6/site-packages/requests/models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib64/python3.6/site-packages/simplejson/__init__.py", line 525, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python3.6/site-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/lib64/python3.6/site-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
</code></pre>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Investigate what is causing the JSON error - likely an error string was returned instead of JSON</li>
<li>Handle the exception in bugzilla_issue.py and print a meaningful error</li>
<li>Investigate why 3 seemingly identical errors are sent in separate emails around the same time</li>
</ul>
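<p>The second suggestion could be sketched roughly as follows. This is a minimal illustration, not the actual openqa_bugfetcher code: <code>BugFetchError</code> and <code>parse_bug_response</code> are hypothetical names, and the function assumes the caller passes in the HTTP status code and the response body as text.</p>

```python
import json


class BugFetchError(Exception):
    """Hypothetical error type for responses that cannot be interpreted."""


def parse_bug_response(bugid, status_code, text):
    """Decode a response body as JSON or raise an error that names the bug."""
    try:
        return json.loads(text)
    except ValueError as e:
        # json.JSONDecodeError and simplejson's JSONDecodeError both
        # derive from ValueError, so this catches either decoder.
        snippet = text[:80].replace("\n", " ")
        raise BugFetchError(
            "%s: expected JSON but got HTTP %d with body starting %r"
            % (bugid, status_code, snippet)
        ) from e
```

<p>With an error like this the cron mail would state which bug failed and what the server actually returned, instead of only the decoder traceback.</p>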
openQA Infrastructure - action #156301 (Resolved): [bot-ng] Pipeline failed / KeyError: 'priority... https://progress.opensuse.org/issues/156301 2024-02-29T08:54:46Z livdywan liv.dywan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2327183" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2327183</a></p>
<pre><code>++ retry -r 30 -e -- ./qem-bot/bot-ng.py -c /etc/openqabot --token [MASKED] incidents-run
[...]
KeyError: 'priority'
Retrying up to 19 more times after sleeping 6144s …
2024-02-29 06:28:46 INFO Bot schedule starts now
Traceback (most recent call last):
File "./qem-bot/bot-ng.py", line 7, in <module>
main()
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/main.py", line 32, in main
sys.exit(cfg.func(cfg))
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/args.py", line 24, in do_incident_schedule
bot = OpenQABot(args)
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/openqabot.py", line 24, in __init__
self.incidents = get_incidents(self.token)
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/loader/qem.py", line 41, in get_incidents
xs.append(Incident(i))
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/types/incident.py", line 23, in __init__
self.priority = incident["priority"]
KeyError: 'priority'
Retrying up to 18 more times after sleeping 12288s …
ERROR: Job failed: execution took longer than 4h0m0s seconds
</code></pre>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>DONE</strong> Restart pipelines</li>
<li>Investigate if there is new data the bot is not handling correctly</li>
<li>Don't provoke timeouts with retrying on reproducible errors</li>
<li>Look into unit test coverage</li>
</ul>
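<p>One way to avoid retrying on reproducible errors is to retry only on exception types that are plausibly transient. The sketch below is an illustration under that assumption, not the behaviour of the <code>retry</code> script used in the pipeline; <code>retry_transient</code> is a hypothetical helper.</p>

```python
def retry_transient(func, attempts=3, transient=(ConnectionError, TimeoutError)):
    """Call func, retrying only on exceptions listed in `transient`.

    Any other exception (for example KeyError from unexpected data) is
    treated as reproducible and propagates on its first occurrence.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except transient:
            if attempt == attempts:
                raise  # out of attempts, surface the last transient error
```

<p>With this split a <code>KeyError: 'priority'</code> would fail the job immediately rather than sleeping for hours between identical failures.</p>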
openQA Project - action #156052 (Resolved): [alert] Scripts CI pipeline failing after logging mu... https://progress.opensuse.org/issues/156052 2024-02-26T10:26:59Z livdywan liv.dywan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://gitlab.suse.de/openqa/scripts-ci/-/jobs/2315561" class="external">https://gitlab.suse.de/openqa/scripts-ci/-/jobs/2315561</a></p>
<pre><code>2 jobs have been created:
- http://openqa.suse.de/tests/13603796
- http://openqa.suse.de/tests/13603797
{"blocked_by_id":null,"id":13603796,"result":"none","state":"scheduled"}
Job state of job ID 13603796: scheduled, waiting …
{"blocked_by_id":null,"id":13603796,"result":"none","state":"running"}
Job state of job ID 13603796: running, waiting …
</code></pre>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Investigate what is causing the pipeline to fail
<ul>
<li>The pipeline fails.</li>
<li>The two created jobs failed.</li>
<li>There are many log messages mentioning "waiting" that do not show whether the wait ended successfully or unsuccessfully.</li>
</ul></li>
</ul>
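<p>To make the outcome explicit in the log, the job-state JSON shown in the observation could be summarized per poll. The following is only a sketch: <code>summarize_job</code> is a hypothetical helper, and the sets of final states and successful results are assumptions that would need checking against the openQA version in use.</p>

```python
import json

# Assumption: a job is finished once its state is "done" or "cancelled",
# and only the results listed in OK_RESULTS count as success.
FINAL_STATES = {"done", "cancelled"}
OK_RESULTS = {"passed", "softfailed"}


def summarize_job(raw):
    """Return (finished, ok, message) for one job-state JSON document."""
    job = json.loads(raw)
    state, result = job["state"], job["result"]
    if state not in FINAL_STATES:
        return False, False, "job %d still %s" % (job["id"], state)
    ok = result in OK_RESULTS
    verdict = "succeeded" if ok else "failed"
    return True, ok, "job %d %s with result %s" % (job["id"], verdict, result)
```

<p>A final line such as "job 13603796 failed with result failed" would then tell the reader why the pipeline stopped.</p>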
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<p>Activate pipelines on <a href="https://gitlab.suse.de/openqa/scripts-ci/-/pipeline_schedules" class="external">https://gitlab.suse.de/openqa/scripts-ci/-/pipeline_schedules</a> again</p>
openQA Infrastructure - action #155689 (Resolved): bot-ng pipelines fails to schedule incidents https://progress.opensuse.org/issues/155689 2024-02-20T11:05:17Z nicksinger nsinger@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Since Feb 19, 2024 6:03pm GMT+0100 our pipelines in bot-ng fail at the step "schedule incidents": <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs</a> e.g. <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2292982" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2292982</a> which was the first I could find</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1</strong>: Pipelines do work again (complete the job "schedule incidents")</li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>activate pipelines <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules</a></li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Check if something changed</li>
<li>Read the logs and try to understand if the issues are the same and if/how we can fix them</li>
</ul>
QA - action #155629 (Resolved): [spike][timeboxed:6h][qem-dashboard] Order blocked incidents by p... https://progress.opensuse.org/issues/155629 2024-02-19T11:12:25Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p><a href="https://smelt.suse.de/overview/#testing" class="external">https://smelt.suse.de/overview/#testing</a> shows incidents that are in testing in much more detail than <a href="http://dashboard.qam.suse.de/blocked" class="external">http://dashboard.qam.suse.de/blocked</a> does. Because openQA test reviewers are apparently not able to review all blocking test failures in time, maintenance coordinators asked for reviews to focus on incidents by priority. So that reviewers can pick the higher-priority incidents first, the entries on <a href="http://dashboard.qam.suse.de/blocked" class="external">http://dashboard.qam.suse.de/blocked</a> should reflect the incident priority, e.g. be ordered by priority or show the priority value.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Proof-of-concept for ordering <a href="http://dashboard.qam.suse.de/blocked" class="external">http://dashboard.qam.suse.de/blocked</a> rows by incident priority</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Example of requesting incident priority for a single incident: <a href="https://smelt.suse.de/graphql/#query=%7B%0A%20%20incidents(incidentId%3A%2032579)%20%7B%0A%20%20%20%20edges%20%7B%0A%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20incidentpackagesSet%20%7B%0A%20%20%20%20%20%20%20%20%20%20edges%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20package%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20name%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20priority%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D%0A" class="external">https://smelt.suse.de/graphql/#query=%7B%0A%20%20incidents(incidentId%3A%2032579)%20%7B%0A%20%20%20%20edges%20%7B%0A%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20incidentpackagesSet%20%7B%0A%20%20%20%20%20%20%20%20%20%20edges%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20package%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20name%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20node%20%7B%0A%20%20%20%20%20%20%20%20priority%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D%0A</a></li>
<li>Extend qem-bot and/or qem-dashboard as necessary to have incident priority available</li>
<li>Look up the most recent change handling embargoed updates ("embargoed" boolean value) for an example of how to add the new data from <a href="https://github.com/openSUSE/qem-dashboard/issues?q=is%3Aclosed+-label%3Adependencies+" class="external">https://github.com/openSUSE/qem-dashboard/issues?q=is%3Aclosed+-label%3Adependencies+</a> and correspondingly from qem-bot (maybe <a href="https://github.com/openSUSE/qem-bot/pull/128" class="external">https://github.com/openSUSE/qem-bot/pull/128</a> is relevant)</li>
</ul>
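<p>The ordering asked for in AC1 could look roughly like this once the priority value is available. This is a sketch under assumptions, not qem-dashboard code: it assumes each incident is a dict, that <code>priority</code> is an integer where a larger value means more urgent, and that incidents without a priority should sort last.</p>

```python
def order_by_priority(incidents):
    """Sort incidents so the highest priority comes first.

    Incidents that carry no "priority" value are placed at the end.
    """
    def key(incident):
        priority = incident.get("priority")
        # First element pushes missing priorities to the end; the second
        # sorts known priorities in descending order.
        return (priority is None, -(priority or 0))

    return sorted(incidents, key=key)
```

<p>Using <code>.get</code> here also keeps the view working for older entries that were stored before the priority field existed.</p>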
openQA Infrastructure - action #155080 (Resolved): jenkins is no longer producing GNOME:Next tes... https://progress.opensuse.org/issues/155080 2024-02-07T13:03:45Z okurz okurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://suse.slack.com/archives/C02CANHLANP/p1707310927769339" class="external">https://suse.slack.com/archives/C02CANHLANP/p1707310927769339</a></p>
<blockquote>
<p>(Dominique Leuenberger) seems jenkins is no longer producing GNOME:Next test runs: <a href="http://jenkins.qa.suse.de/job/gnome_next-openqa/8670/console" class="external">http://jenkins.qa.suse.de/job/gnome_next-openqa/8670/console</a></p>
</blockquote>
<pre><code>Caused: java.io.IOException: Cannot run program "/bin/sh" (in directory "/var/lib/jenkins/workspace/gnome_next-openqa"): error=0, Failed to exec spawn helper: pid: 2883, signal: 11
</code></pre>
openQA Infrastructure - action #154549 (Resolved): Certain queries on poo are not accessible https://progress.opensuse.org/issues/154549 2024-01-30T12:44:44Z livdywan liv.dywan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Queries like <a href="https://progress.opensuse.org/issues?query_id=754" class="external">non-infra tasks</a>, <a href="https://progress.opensuse.org/issues?query_id=757" class="external">infra tickets</a> and <a href="https://progress.opensuse.org/issues?query_id=717" class="external">non-estimated tickets</a> display an error:</p>
<pre><code>An error occurred while executing the query and has been logged. Please report this error to your Redmine administrator.
</code></pre>
<a name="Work-around"></a>
<h2 >Work-around<a href="#Work-around" class="wiki-anchor">¶</a></h2>
<ul>
<li>Use other backlog queries that don't rely on tags, in particular non-infra queries</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li><em>DONE</em> Re-open <a class="issue tracker-10 status-3 priority-4 priority-default closed" title="tickets: Update to Redmine 5 (Resolved)" href="https://progress.opensuse.org/issues/133532#note-757291">#133532#note-757291</a> and follow up on the most recent changes</li>
</ul>
openQA Infrastructure - action #154546 (Resolved): Cron fetch_openqa_bugs refused or timed out tr... https://progress.opensuse.org/issues/154546 2024-01-30T12:26:07Z livdywan liv.dywan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<pre><code>[...]
socket.timeout: timed out
During handling of the above exception, another exception occurred:
[...]
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7fdf94b4b4a8>, 'Connection to progress.opensuse.org timed out. (connect timeout=60)')
During handling of the above exception, another exception occurred:
[...]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/31576.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fdf94b4b4a8>, 'Connection to progress.opensuse.org timed out. (connect timeout=60)'))
During handling of the above exception, another exception occurred:
[...]
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/31576.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fdf94b4b4a8>, 'Connection to progress.opensuse.org timed out. (connect timeout=60)'))
</code></pre>
<p>or</p>
<pre><code>[...]
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
[...]
File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 172, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f6c6acae588>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
[...]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/150923.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f6c6acae588>: Failed to establish a new connection: [Errno 111] Connection refused',))
During handling of the above exception, another exception occurred:
[...]
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/150923.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f6c6acae588>: Failed to establish a new connection: [Errno 111] Connection refused',))
</code></pre>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Add (more) retries, or consider exponential backoff</li>
<li>Ignore errors in the context of bug fetching - they are not typically actionable anyway</li>
<li>Come up with another way to alert us of on-going lack of updates of the bugs database</li>
</ul>
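<p>Exponential backoff, as mentioned in the first suggestion, can be sketched with the standard library alone. <code>call_with_backoff</code> is a hypothetical helper, and the default delays are arbitrary. Note that requests raises its own exception classes (such as <code>requests.exceptions.ConnectionError</code> and <code>ConnectTimeout</code>, as seen in the tracebacks), which do not derive from the built-in <code>ConnectionError</code>, so those would have to be passed in via <code>retry_on</code>.</p>

```python
import time


def call_with_backoff(func, retries=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call func; on a listed error wait base_delay * 2**attempt and retry.

    The wait is capped at max_delay. After `retries` failed retries the
    last error is re-raised so the caller can decide whether to ignore it.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # retries exhausted
            sleep(min(base_delay * 2 ** attempt, max_delay))
```

<p>The <code>sleep</code> parameter exists only so that the waiting can be replaced in tests.</p>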
QA - action #154498 (Resolved): [spike][timeboxed:20h][integration] Approve/reject SLE maintenanc... https://progress.opensuse.org/issues/154498 2024-01-29T17:33:42Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>One of the most important responsibilities within SLE maintenance testing is to approve/reject SLE maintenance release requests based on openQA test results. So far <a href="https://github.com/openSUSE/qem-bot" class="external">qem-bot</a> is sufficient to schedule openQA tests but does only a mediocre job of reporting back results: test results are polled asynchronously on a periodic schedule <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules</a>, which causes unnecessary delays and inefficient polling, uses outdated results <a class="issue tracker-4 status-4 priority-4 priority-default child" title="action: Use live openQA test results instead of inconsistent qem-dashboard database in qem-bot approver (Feedback)" href="https://progress.opensuse.org/issues/122311">#122311</a> and does not even report back on blocking test failures <a class="issue tracker-6 status-1 priority-4 priority-default child parent" title="coordination: [epic] enable qem-bot comments on IBS (was: enable qa-maintenance/openQABot comments on smelt again) (New)" href="https://progress.opensuse.org/issues/97121">#97121</a>. Let's use a proper architecture with efficient event-based triggers that provide relevant information back to release requests on IBS, relying on core openQA features rather than on too much custom, lacking downstream tooling: develop a proof-of-concept that listens to yet-to-be-designed "openQA product build testing finished" AMQP events and approves/rejects the corresponding release request.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Research how common OBS checks are implemented, e.g. openQA staging test integration, legalreview, installcheck, etc. For this see <a href="https://github.com/openSUSE/opensuse-release-tools" class="external">https://github.com/openSUSE/opensuse-release-tools</a>
<ul>
<li>Read <a href="https://github.com/openSUSE/openSUSE-release-tools/blob/master/gocd/rabbit-openqa.py" class="external">https://github.com/openSUSE/openSUSE-release-tools/blob/master/gocd/rabbit-openqa.py</a></li>
<li>Read <a href="https://github.com/openSUSE/openSUSE-release-tools/blob/master/gocd/rabbit-repoid.py" class="external">https://github.com/openSUSE/openSUSE-release-tools/blob/master/gocd/rabbit-repoid.py</a></li>
</ul></li>
<li>Follow <a class="issue tracker-4 status-3 priority-3 priority-lowest closed child" title="action: Find "last build" of a product over API size:M (Resolved)" href="https://progress.opensuse.org/issues/152939">#152939</a> and add publishing for an AMQP event for when incident "foo" finishes testing in openQA. For finding all tests related to incident "foo" see <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Provide API to get job results for a particular incident, similar to what dashboard/qem-bot does ... (Resolved)" href="https://progress.opensuse.org/issues/117655">#117655</a></li>
<li>Integrate both of the above either in a new standalone application or hack into <a href="https://github.com/openSUSE/qem-bot" class="external">https://github.com/openSUSE/qem-bot</a> – as part of a spike solution so do not be afraid to break any other use case – to approve/"reject" SLE maintenance release requests. If "reject" seems to be too severe then provide only "informational" feedback, e.g. as IBS comment or checker result.</li>
<li>Optionally consider implementing this as an openQA plugin, maybe that is simpler for some cases</li>
</ul>
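<p>The decision step of such a proof-of-concept could be sketched as below. Everything about the event is an assumption, since the "openQA product build testing finished" event is not designed yet: the payload shape, the <code>decide</code> function and the action names are hypothetical. Following the suggestion above, failures lead to an informational comment rather than a hard reject.</p>

```python
import json


def decide(event_body):
    """Map a hypothetical "build testing finished" event to a review action.

    Assumed payload: {"incident": <int>, "results": {<result name>: <count>}}.
    Returns (incident, action, reason) with action "approve" or "comment".
    """
    event = json.loads(event_body)
    incident, results = event["incident"], event["results"]
    if sum(results.values()) == 0:
        return incident, "comment", "no test results available"
    # Any non-zero count outside passed/softfailed blocks the approval.
    blocking = sorted(name for name, count in results.items()
                      if name not in ("passed", "softfailed") and count > 0)
    if blocking:
        return incident, "comment", "blocking results: " + ", ".join(blocking)
    return incident, "approve", "all tests passed"
```

<p>An AMQP consumer would call this once per received message and pass the returned action on to whatever talks to IBS.</p>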
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Also related to <a class="issue tracker-4 status-4 priority-4 priority-default child" title="action: Use live openQA test results instead of inconsistent qem-dashboard database in qem-bot approver (Feedback)" href="https://progress.opensuse.org/issues/122311">#122311</a>, <a class="issue tracker-6 status-1 priority-3 priority-lowest" title="coordination: [saga][epic] Re-combined Maintenance QA tooling covering both SLE+openSUSE (New)" href="https://progress.opensuse.org/issues/123088">#123088</a>, <a class="issue tracker-6 status-1 priority-4 priority-default child parent" title="coordination: [epic] enable qem-bot comments on IBS (was: enable qa-maintenance/openQABot comments on smelt again) (New)" href="https://progress.opensuse.org/issues/97121">#97121</a>, <a class="issue tracker-6 status-1 priority-4 priority-default overdue parent behind-schedule" title="coordination: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, ... (New)" href="https://progress.opensuse.org/issues/99303">#99303</a>, <a class="issue tracker-4 status-3 priority-3 priority-lowest closed child" title="action: Find "last build" of a product over API size:M (Resolved)" href="https://progress.opensuse.org/issues/152939">#152939</a>, <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: [timeboxed:6h][spike solution] a single command line or openQA webUI search view to show all test... (Resolved)" href="https://progress.opensuse.org/issues/131279">#131279</a>, <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Provide API to get job results for a particular incident, similar to what dashboard/qem-bot does ... (Resolved)" href="https://progress.opensuse.org/issues/117655">#117655</a></p>
<a name="Out-of-scope"></a>
<h2 >Out of scope<a href="#Out-of-scope" class="wiki-anchor">¶</a></h2>
<ul>
<li>Where to run persistently</li>
</ul>
openQA Infrastructure - action #153958 (Resolved): [alert] s390zl12: Memory usage alert Generic m... https://progress.opensuse.org/issues/153958 2024-01-19T11:57:59Z tinita tina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<pre><code>Date: Fri, 19 Jan 2024 11:55:37 +0100
1 firing alert instance
[IMAGE]
GROUPED BY
hostname=s390zl12
1 firing instances
Firing [stats.openqa-monitor.qa.suse.de]
s390zl12: Memory usage alert
View alert [stats.openqa-monitor.qa.suse.de]
Values
A0=0.06117900738663373
Labels
alertname
s390zl12: Memory usage alert
grafana_folder
Generic
hostname
s390zl12
rule_uid
memory_usage_alert_s390zl12
</code></pre>
<p><a href="http://stats.openqa-monitor.qa.suse.de/alerting/grafana/memory_usage_alert_s390zl12/view?orgId=1" class="external">http://stats.openqa-monitor.qa.suse.de/alerting/grafana/memory_usage_alert_s390zl12/view?orgId=1</a></p>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<p>Remove silence "alertname=s390zl12: Memory usage alert" from <a href="https://stats.openqa-monitor.qa.suse.de/alerting/silences" class="external">https://stats.openqa-monitor.qa.suse.de/alerting/silences</a></p>
QA - action #153937 (Resolved): [tools] Make the end of each meeting explicit size:S https://progress.opensuse.org/issues/153937 2024-01-19T10:54:34Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Based on retro 2024-01-19. okurz: I found several of our meetings a bit frustrating because I did not receive enough feedback, partly because many participants do not have their camera enabled. I would really like it if people participated more actively.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> One mention on our team wiki about best practices</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Make the end of each meeting explicit e.g. "We are done with the call", use sound effects</li>
<li>Encourage people to leave or agree to wrap up early if participants are struggling to contribute</li>
</ul>
openQA Infrastructure - action #151696 (Resolved): Evaluate use of https://itpe.io.suse.de/open-p... https://progress.opensuse.org/issues/151696 2023-11-29T14:46:26Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>We are often asked to provide VMs for developers as personal development environments, so we need to maintain hypervisor hosts, which so far we do rather manually. More modern tooling and cloud solutions should be able to do a much better job at that.<br>
Announcement from today that OpenPlatform is ready for everyone in SUSE, newsletter <a href="https://itpe.io.suse.de/open-platform/docs/news/this-sprint-in-openplatform-9/" class="external">https://itpe.io.suse.de/open-platform/docs/news/this-sprint-in-openplatform-9/</a><br>
(I saw that first mentioned in <a href="https://suse.slack.com/archives/C029APBKLGK/p1701266764648529?thread_ts=1701266303.318549&cid=C029APBKLGK" class="external">https://suse.slack.com/archives/C029APBKLGK/p1701266764648529?thread_ts=1701266303.318549&cid=C029APBKLGK</a>)</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> It is known if this new platform can be used in any form for individual openQA deployments or personal VMs for us or our SUSE internal users</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read <a href="https://itpe.io.suse.de/open-platform/docs/" class="external">https://itpe.io.suse.de/open-platform/docs/</a> , create a simple virtual machine, demonstrate it to the team</li>
<li>If it works out fine include instructions in a corresponding wiki place e.g. on <a href="https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs/QA-SLE_cluster" class="external">https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs/QA-SLE_cluster</a> so that our fellow SUSE colleagues know how to use it to create personal VMs rather than relying on our limited hosts</li>
<li>For further support join Slack channel <a href="https://suse.slack.com/archives/C04S88VCHS7" class="external">#proj-it-open-platform</a></li>
</ul>
QA - action #139199 (Resolved): Ensure OSD openQA PowerPC machine redcurrant is operational from ... https://progress.opensuse.org/issues/139199 2023-11-08T14:03:39Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Most PowerPC machines are being set up in PRG2 within <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> and are at least discoverable from HMC. Now we can set up redcurrant as production openQA PowerVM worker in OSD again.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> redcurrant openQA instances as referenced in <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls</a> are able to pass openQA jobs after the move to PRG2</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> about the generic setup and in particular the HMC</li>
<li>See current infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=11354" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=11354</a> and child objects about the machine "redcurrant"</li>
<li>Ensure we have access to redcurrant manually as well as with verification openQA jobs, both for osd</li>
<li>Update the infrastructure management network entry, e.g. new FQDN</li>
<li>Inform users about the result</li>
<li>Cross-check the corresponding alert silences on <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a></li>
</ul>
QA - action #139112 (Resolved): Ensure OSD openQA PowerPC machine grenache is operational from PRG2 https://progress.opensuse.org/issues/139112 2023-11-04T12:46:49Z okurz okurz@suse.com
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Most PowerPC machines are being set up in PRG2 within <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> and most machines could be discovered from the HMC but apparently not grenache.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> grenache openQA instances as referenced in <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls</a> are able to pass openQA jobs after the move to PRG2</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> about the generic setup and in particular the HMC</li>
<li>See current infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=3120" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=3120</a> about the machine "grenache"</li>
<li>Ensure we have access to grenache manually as well as with verification openQA jobs, both for osd</li>
<li>Update the infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=3120" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=3120</a> accordingly, e.g. new FQDN</li>
<li>Inform users about the result</li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>Add back salt key with <code>salt-key -y -a grenache-1.qa.suse.de</code></li>
</ul>