coordination #48641
opencoordination #58184: [saga][epic][use case] full version control awareness within openQA
[epic] Trigger openQA tests in pull requests of any product github pull request
100%
Description
User Story
As a developer of any software on github I want to execute an openQA test on a production server based on a pull request of my software to not need a local openQA instance
Further details
okurz:
Had a nice discussion with trenn: https://github.com/cobbler/cobbler/pull/2024 is a nice example of what one would desire to do in openQA instead, e.g.
- pull request in github triggers openQA test
- openQA test executes a custom test defined in the github source repo of the product under test (cobbler in this case)
- openQA test only boots a VM image, e.g. Tumbleweed, and executes the custom module
- test result is fed back to the github PR
implementation suggestions:
- could be done with polling bot for now
- using latest os-autoinst-distri-opensuse plus, as custom assets currently do not work, a custom test module that (same as what we already tried for HPC) gets the defined test script, which could be anything downloadable and executable, and runs it
- use SCHEDULE parameter with e.g. SCHEDULE=tests/boot/boot_to_desktop,tests/run_custom
- github API access relying on test variables previously provided on trigger
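The suggestions above can be sketched roughly as follows. This is only an illustration of what a polling bot might compute before scheduling a job, not an existing implementation. Only the SCHEDULE value comes from this ticket; the other variable names (CUSTOM_TEST_URL, GITHUB_REPO, GITHUB_PR, GITHUB_SHA), the script path and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    """Minimal view of a pull request as a polling bot might see it."""
    repo: str       # e.g. "cobbler/cobbler"
    number: int
    head_sha: str


def new_heads(pull_requests, seen_shas):
    """Return the pull requests whose head commit has not been tested yet."""
    return [pr for pr in pull_requests if pr.head_sha not in seen_shas]


def job_settings(pr, test_script_path="test/openqa.sh"):
    """Build test variables for one pull request.

    Only SCHEDULE is taken from the ticket; all other keys and the default
    script path are hypothetical names chosen for this sketch.
    """
    script_url = (
        f"https://raw.githubusercontent.com/{pr.repo}/{pr.head_sha}/{test_script_path}"
    )
    return {
        "SCHEDULE": "tests/boot/boot_to_desktop,tests/run_custom",
        "CUSTOM_TEST_URL": script_url,  # script the custom module would download and run
        "GITHUB_REPO": pr.repo,         # kept so the result can be reported back later
        "GITHUB_PR": str(pr.number),
        "GITHUB_SHA": pr.head_sha,
    }


def as_cli_args(settings):
    """Render settings as KEY=value arguments, sorted for reproducibility."""
    return [f"{key}={value}" for key, value in sorted(settings.items())]
```

The KEY=value list could then be passed to whatever client schedules the job. Keeping the repository and commit in the test variables is what would allow the result to be fed back to the right PR later.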
Further ideas
With #128360 a webhook-based approach has been implemented. The following aspects of it could still be improved:
- Better scheduled product pages: The PR check "Details" link leads to the page of the related scheduled product. It would make sense to list all jobs belonging to that scheduled product there in a nice table so one can easily access the jobs. It would also be great if it would show the latest jobs in the restart/clone chain (and not just the initial jobs). Maybe we could also show the overall status of the scheduled product reusing the code we use to report it to the CI check. Note that this would also be an improvement in general as it helps not just for the CI use case but when dealing with scheduled products in general.
- Better error handling when reporting back: possibly add a retry, add an audit event in case something goes wrong
- Finer/better permissions: So far we require an API key/secret pair where the associated user is at least operator. That already includes many permissions. We could allow the creation of API key/secret pairs that have fewer permissions.
- Have an extra token as secret for signing of the webhook payload: This extra token could be generated for the used API key/secret pair so the secret for signing would not just basically be the API user/credentials again.
- Allow computing the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the latest Tumbleweed snapshot as the openQA-in-openQA test does).
- Add a mechanism to prevent users from spamming jobs.
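To illustrate the idea of a dedicated signing token: GitHub signs webhook payloads with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header as "sha256=" followed by the hex digest. A minimal sketch of verifying that with a separate token could look like this. The function names are hypothetical, and this is not the code openQA uses.

```python
import hashlib
import hmac


def sign_payload(token: bytes, payload: bytes) -> str:
    """Compute the header value GitHub would send for this payload and token."""
    return "sha256=" + hmac.new(token, payload, hashlib.sha256).hexdigest()


def verify_signature(token: bytes, payload: bytes, header_value: str) -> bool:
    """Check a received X-Hub-Signature-256 value against the raw request body."""
    expected = sign_payload(token, payload)
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The token here would be generated per API key/secret pair, so a leaked webhook secret would not expose the API credentials themselves.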
Updated by okurz over 5 years ago
- Copied from action #44327: Trigger tests based on git refspec/branch added
Updated by mkittler almost 5 years ago
- This use-case is also impaired by the inability to use custom needles (https://progress.opensuse.org/issues/56789 #56789).
- Besides, the implementation suggestion
use SCHEDULE parameter with e.g. SCHEDULE=tests/boot/boot_to_desktop,tests/run_custom
implies that composability of test distributions would be required. This is of course possible by somehow embedding the base test distribution (e.g. os-autoinst-distri-opensuse) into the product repository (e.g. cobbler) and adding own tests on top of it. But this sounds rather hacky and inconvenient to use. So having a light test distribution which can simply be referred to as "base test distribution" seems desirable for this use-case.
These points are likely the tricky part. Triggering the test execution itself is likely not that hard. Instead of writing a "polling bot" I would add an API route in openQA which can be added as GitHub hook.
Updated by okurz over 4 years ago
mkittler wrote:
- This use-case is also impaired by the inability to use custom needles (https://progress.opensuse.org/issues/56789 #56789).
Maybe we can still consider any need for needles an independent requirement from the part about triggering test code, which I find more important.
Regarding the "hacky approach" or "base test distribution": I think we have a ticket for having something like a "linux" middleware between os-autoinst and os-autoinst-distri-opensuse that can be extracted. For now, to have a proof of concept to improve upon, I recommend the "hacky" approach.
[…] I would add an API route in openQA which can be added as GitHub hook.
What would you add as an API route here?
Updated by okurz almost 4 years ago
- Related to coordination #77698: [epic] synchronous qemu based system level test in pull request CI runs, e.g. standalone isotovideo or openQA tests added
Updated by okurz almost 4 years ago
- Subject changed from Trigger openQA tests in pull requests of any product github pull request to [epic] Trigger openQA tests in pull requests of any product github pull request
- Status changed from New to Blocked
- Assignee set to okurz
waiting for #77698 first
Updated by okurz over 3 years ago
- Target version changed from Ready to future
Updated by szarate over 1 year ago
- Related to action #124173: [qe-core] Create status badges for verification runs added
Updated by mkittler over 1 year ago
- Description updated (diff)
I've extended the "Further ideas" section because we've noticed that the webhook-based approach is not yet applicable for repos like the openQA-in-openQA test.
Updated by okurz over 1 year ago
okurz wrote:
[…]
- Allow computing the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the latest Tumbleweed snapshot as the openQA-in-openQA test does).
For now I don't think this is necessary as one can use a static link to an appliance, e.g. http://download.opensuse.org/tumbleweed/appliances/openSUSE-Tumbleweed-Minimal-VM.x86_64-kvm-and-xen.qcow2
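Should the static link turn out not to be enough, the regex-based selection quoted above could look roughly like the following sketch. It assumes the pattern has one capture group holding a sortable snapshot identifier such as a YYYYMMDD date. The function name is hypothetical, and how the list of asset names is obtained from the openQA instance is left open.

```python
import re


def pick_latest_asset(pattern, asset_names):
    """Return the matching asset with the highest first capture group, or None.

    The capture group is compared as a string, so it is assumed to sort
    chronologically (e.g. a YYYYMMDD snapshot date).
    """
    regex = re.compile(pattern)
    best = None
    for name in asset_names:
        match = regex.fullmatch(name)
        if match is None:
            continue
        key = match.group(1)
        if best is None or key > best[0]:
            best = (key, name)
    return best[1] if best else None
```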
Updated by livdywan over 1 year ago
okurz wrote:
okurz wrote:
[…]
- Allow computing the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the latest Tumbleweed snapshot as the openQA-in-openQA test does).
For now I don't think this is necessary as one can use a static link to an appliance, e.g. http://download.opensuse.org/tumbleweed/appliances/openSUSE-Tumbleweed-Minimal-VM.x86_64-kvm-and-xen.qcow2
How? This is an external URL. You're quoting a use case for re-using the image generated by a job.
Updated by okurz over 1 year ago
cdywan wrote:
How? This is an external URL. You're quoting a use case for re-using the image generated by a job.
There was never a strong need to re-use an openQA generated image though.
Updated by okurz about 1 year ago
- Target version changed from Ready to Tools - Next
Updated by szarate 7 months ago
@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)
Also, because I don't really want to reopen #150992:
Trying to understand the failure rate of #150992 and other consequences in the context of recent slack chat, I came to notice at least one 503 (bad gateway), which reminds me of the problems I already mentioned to you, while discussing #154798
That "sleep 1" is needed, as otherwise the webUI will return 503's constantly.
And before moving any further, somehow that has to be addressed, which brings me again to the question of how we track usage statistics and monitor the web UI health. IIRC there was either a Mojolicious UI that was showing some stats (@kraih, maybe you recall?), or whatever we did to solve the similar problem we were having years ago; I think it was by incrementing something with the webUI workers/processes and so on?
https://github.com/os-autoinst/os-autoinst-distri-opensuse/actions/runs/7666441613/job/20894254516#step:5:143
Updated by okurz 7 months ago
szarate wrote in #note-25:
@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)
The next steps are the subtasks in this ticket here, e.g. #130934 and #152939 currently in the backlog.
After those the other open subtasks or what comes up based on our experiences with those from production.
Also, because I don't really want to reopen #150992:
[...] I came to notice at least one 503 (bad gateway)
Yes, that's a known behavior. "Bad gateway" is a bit misleading but hard to avoid. In this context here it shouldn't ever be a problem as there is retrying and the final result is monitored for anyway.
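As an illustration of why a transient 503 should not matter when there is retrying, a minimal retry loop with exponential backoff might look like this. It is only a sketch, not the code used in the CI job: request stands for any callable returning an HTTP status code, and the function name and defaults are made up.

```python
import time


def call_with_retry(request, attempts=5, delay=1.0, sleep=time.sleep):
    """Call request() until it returns a status other than 503.

    Waits delay, 2*delay, 4*delay, ... seconds between attempts and returns
    the last status seen once the attempts are used up.
    """
    status = None
    for attempt in range(attempts):
        status = request()
        if status != 503:
            return status
        if attempt < attempts - 1:
            sleep(delay * 2 ** attempt)
    return status
```

The sleep parameter is injectable so the backoff can be checked without actually waiting.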
which reminds me of the problems I already mentioned to you, while discussing #154798
Yes, I would like to solve such 503's as part of that
Updated by szarate 7 months ago
okurz wrote in #note-26:
szarate wrote in #note-25:
@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)
The next steps are the subtasks in this ticket here, e.g. #130934 and #152939 currently in the backlog.
After those the other open subtasks or what comes up based on our experiences with those from production.
Also, because I don't really want to reopen #150992:
[...] I came to notice at least one 503 (bad gateway)
Yes, that's a known behavior. "Bad gateway" is a bit misleading but hard to avoid. In this context here it shouldn't ever be a problem as there is retrying and the final result is monitored for anyway.
which reminds me of the problems I already mentioned to you, while discussing #154798
Yes, I would like to solve such 503's as part of that
Much clearer now <3 appreciate the clarity