action #157741

open

coordination #99303: [saga][epic] Future improvements for SUSE Maintenance QA workflows with fully automated testing, approval and release

coordination #97121: [epic] enable qem-bot comments on IBS (was: enable qa-maintenance/openQABot comments on smelt again)

Approve/reject SLE maintenance release requests on IBS synchronously listening to AMQP events when testing for one release request as "openQA product build" is finished size:M

Added by okurz about 1 month ago. Updated 3 days ago.

Status:
Blocked
Priority:
Normal
Assignee:
Target version:
Start date:
2024-03-22
Due date:
2024-04-30 (Due in 3 days)
% Done:

0%

Estimated time:

Description

Motivation

One of the most important responsibilities within SLE maintenance testing is to approve/reject SLE maintenance release requests based on openQA test results. So far qem-bot is sufficient to schedule openQA tests but does only a mediocre job of reporting back results: test results are polled asynchronously on a periodic schedule (https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules), which causes unnecessary delays and inefficient polling, relies on outdated results (#122311) and does not even report back on blocking test failures (#97121). Let's use a proper architecture with efficient event-based triggers that provides relevant information back to release requests on IBS, using core openQA features rather than relying on too much custom downstream tooling that is lacking. After the PoC in #154498-14 we should fully implement this so that the corresponding release request is approved/rejected synchronously in response to AMQP events.

Acceptance criteria

  • AC1: something synchronously approves based on AMQP events

Suggestions

  • Follow up on the PoC of #154498-14
  • Set up qem-bot or an alternative on an existing or new server, but make the logs accessible
  • Add it as part of qem-dashboard, which already has AMQP support
  • Ensure that qem-bot runs near-continuously so it can listen to all AMQP events, maybe via back-to-back GitLab CI jobs with limits to prevent parallel execution, which we already have?
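As a rough illustration of the event-based approach suggested above, here is a minimal sketch of the decision a listener could make for each incoming AMQP message. The routing key suffix `.openqa.job.done` and the payload fields `BUILD`, `group_id` and `remaining` are assumptions made for this example and would need to be checked against the events actually emitted. The function name `build_finished` is hypothetical and not part of qem-bot.

```python
import json
from typing import Optional


def build_finished(routing_key: str, body: bytes) -> Optional[dict]:
    """Return build details if the event marks the last job of a build as done.

    Assumption: "job done" events carry a JSON body with the fields
    BUILD, group_id and remaining (number of unfinished jobs in the build).
    """
    # Ignore every event that is not a "job done" event (assumed key suffix).
    if not routing_key.endswith(".openqa.job.done"):
        return None
    payload = json.loads(body)
    # Only react once no jobs of the build are left; a missing field counts
    # as "still running" so we never approve on incomplete information.
    if payload.get("remaining", 1) != 0:
        return None
    return {"build": payload.get("BUILD"), "group_id": payload.get("group_id")}
```

A listener would call this from its message callback and only then look up the results of the finished build to decide whether to approve or reject the release request.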

Further details

Also related to #122311, #123088, #97121, #99303, #152939, #131279, #117655


Related issues 1 (0 open, 1 closed)

Copied from QA - action #154498: [spike][timeboxed:20h][integration] Approve/reject SLE maintenance release requests on IBS synchronously listening to AMQP events when testing for one release request as "openQA product build" is finished size:M (Resolved, jbaier_cz)

Actions #1

Updated by okurz about 1 month ago

  • Copied from action #154498: [spike][timeboxed:20h][integration] Approve/reject SLE maintenance release requests on IBS synchronously listening to AMQP events when testing for one release request as "openQA product build" is finished size:M added
Actions #2

Updated by szarate about 1 month ago

Two questions I have: does a build also consider aggregates?

Consider e.g. Wicked: https://openqa.suse.de/tests/overview?distri=sle&&build=%3A32459%3Awicked&&build=:32458:wicked&build=:32460:wicked

Where this search is only showing single incidents, but doesn't show aggregate updates :D

Actions #3

Updated by okurz about 1 month ago

  • Target version changed from Tools - Next to Ready
Actions #4

Updated by okurz about 1 month ago

szarate wrote in #note-2:

Two questions I have: does a build also consider aggregates?

Yes.

What's the second question?

Actions #5

Updated by okurz about 1 month ago

  • Description updated (diff)
Actions #6

Updated by okurz 30 days ago

  • Subject changed from Approve/reject SLE maintenance release requests on IBS synchronously listening to AMQP events when testing for one release request as "openQA product build" is finished to Approve/reject SLE maintenance release requests on IBS synchronously listening to AMQP events when testing for one release request as "openQA product build" is finished size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #7

Updated by mkittler 12 days ago

  • Assignee set to mkittler
Actions #8

Updated by mkittler 12 days ago

  • Status changed from Workable to In Progress
Actions #9

Updated by openqa_review 11 days ago

  • Due date set to 2024-04-30

Setting due date based on mean cycle time of SUSE QE Tools

Actions #11

Updated by okurz 11 days ago

https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2502425#L57

ModuleNotFoundError: No module named 'pika'

Actions #12

Updated by mkittler 11 days ago · Edited

The PR was merged and I configured the pipeline under https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules.

I created https://sd.suse.com/servicedesk/customer/portal/1/SD-154403 to allow the traffic because the pipeline currently runs into a connection error.

Maybe we also still need to ensure that the TLS certificate is available within the container (as was done for #158907). Edit: The TLS certificates are already installed in the container (see https://build.suse.de/projects/QA:Maintenance/packages/openSUSE-Leap-Container/files/Dockerfile?expand=1).

Actions #13

Updated by mkittler 10 days ago

  • Status changed from In Progress to Blocked
Actions #14

Updated by mkittler 9 days ago

  • Status changed from Blocked to In Progress

In https://sd.suse.com/servicedesk/customer/portal/1/SD-154403 I was asked for the approval of the buildops team as it is the owner of amqps://rabbit.suse.de. They were not happy with us "abusing shared gitlab resources for this", so I suppose we'd better not go down that road. I'll set up the daemon on qam2.qe.prg2.suse.org instead. I suppose the only real disadvantage is that the AMQP "job" won't show up alongside the others on GitLab.

Actions #16

Updated by livdywan 8 days ago

Failed with pika.exceptions.AMQPConnectionError now, see https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/2512747

Actions #17

Updated by mkittler 5 days ago

  • Status changed from In Progress to Blocked

I just tried it again to see whether DNS has changed now but it still fails.

I also stopped qem-bot-amqp-watcher.service on qam2.qe.prg2.suse.org again as we're going for openplatform. If that turns out to work I'll completely remove the service from qam2.qe.prg2.suse.org.

For now I keep this blocked on #156214.

Actions #19

Updated by mkittler 5 days ago · Edited

Once we have access we'd probably need to build an RPM package for bot-ng and a container image installing it (according to https://itpe.io.suse.de/open-platform/docs/docs/getting_started/quickstart/#build-rpm-packages-and-container-images). We could maybe also skip the packaging step and clone the Git repo directly when building the container. That might simplify things and we don't need to build anything here anyway.


By the way, I tried to improve the error handling of the AMQP code so we get more than just the exception type AMQPConnectionError: https://github.com/Martchus/qem-bot/pull/new/amqp-2
This didn't work, though. It looks like the error message is actually also shown without such a change, e.g.:

…
  File "/usr/lib64/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

However, in case of a connection error (and not a DNS error) there simply seems to be no error message available because even with my change all I get is:

./bot-ng.py --configs ../metadata -t 1234 --dry amqp --url amqp://10.145.56.20
2024-04-22 17:33:31 ERROR    Establishing AMQP connection to 'amqp://10.145.56.20': 

So this change makes things even worse as we now don't even know that it is an AMQPConnectionError. Considering https://pika.readthedocs.io/en/stable/modules/exceptions.html#pika.exceptions.AMQPConnectionError the error class AMQPConnectionError is probably the best we can get in certain cases.
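One way to avoid losing the exception type when the message is empty is to always log the class name and append the message only if there is one. The following is a minimal sketch of that idea, not the code from the branch above. `FakeAMQPConnectionError` is a hypothetical stand-in for `pika.exceptions.AMQPConnectionError` so that the example runs without pika and without a broker, and the helper names are made up for illustration.

```python
import logging
import socket

log = logging.getLogger("amqp-example")


class FakeAMQPConnectionError(Exception):
    """Hypothetical stand-in for pika.exceptions.AMQPConnectionError."""


def describe_exception(error: BaseException) -> str:
    """Return 'TypeName: message', or just 'TypeName' if the message is empty."""
    name = type(error).__name__
    message = str(error).strip()
    return f"{name}: {message}" if message else name


def log_connection_failure(url: str, error: BaseException) -> str:
    """Log a connection failure so that the exception type is never lost."""
    text = f"Establishing AMQP connection to '{url}' failed: {describe_exception(error)}"
    log.error(text)
    return text
```

With this, an exception raised without arguments is still reported by its class name, while a DNS failure such as `socket.gaierror` keeps its full message.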

Actions #20

Updated by mkittler 3 days ago

Of course we could also just use https://build.suse.de/projects/QA:Maintenance/packages/openSUSE-Leap-Container/files/Dockerfile again and do the checkout manually like in https://gitlab.suse.de/qa-maintenance/bot-ng/-/blob/master/.gitlab-ci.yml#L47.

Otherwise I suppose https://build.suse.de/project/show/QA:Maintenance would be the right place to add a new container (based on the existing openSUSE-Leap-Container in the same project).

Actions #21

Updated by okurz 3 days ago

mkittler wrote in #note-20:

Of course we could also just use https://build.suse.de/projects/QA:Maintenance/packages/openSUSE-Leap-Container/files/Dockerfile again and do the checkout manually like in https://gitlab.suse.de/qa-maintenance/bot-ng/-/blob/master/.gitlab-ci.yml#L47.

Otherwise I suppose https://build.suse.de/project/show/QA:Maintenance would be the right place to add a new container (based on the existing openSUSE-Leap-Container in the same project).

I suggest not using IBS unless we have to. It shouldn't be too hard to create our own variant in OBS.
