coordination #48641

open

coordination #58184: [saga][epic][use case] full version control awareness within openQA

[epic] Trigger openQA tests in pull requests of any product github pull request

Added by okurz about 5 years ago. Updated about 8 hours ago.

Status: Blocked
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2020-11-15
Due date: 2024-04-11 (Due in 13 days)
% Done: 80%
Estimated time: (Total: 0.00 h)

Description

User Story

As a developer of any software on GitHub, I want to execute an openQA test on a production server based on a pull request of my software so that I do not need a local openQA instance.

Further details

okurz:
Had a nice discussion with trenn: https://github.com/cobbler/cobbler/pull/2024 is a nice example of what one would desire to do in openQA instead, e.g.

  1. pull request in github triggers openQA test
  2. openQA test executes a custom test defined in the github source repo of the product under test (cobbler in this case)
  3. openQA test only boots a VM image, e.g. Tumbleweed, and executes the custom module
  4. test result is fed back to the github PR

implementation suggestions:

  1. could be done with polling bot for now
  2. using the latest os-autoinst-distri-opensuse plus (as custom assets currently do not work) a custom test module that, same as what we already tried for HPC, gets the defined test script, which could be anything downloadable and executable, and runs it
  3. use SCHEDULE parameter with e.g. SCHEDULE=tests/boot/boot_to_desktop,tests/run_custom
  4. github API access relying on test variables previously provided on trigger
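
A minimal sketch of how a polling bot could assemble the job settings for one pull request, assuming the custom test module reads its inputs from test variables. Only the SCHEDULE value comes from the suggestion above; the function names and the variables CUSTOM_TEST_URL, GITHUB_REPO, GITHUB_PR and GITHUB_SHA are hypothetical and only serve as illustration.

```python
from urllib.parse import urlencode


def build_job_settings(repo_url, pr_number, head_sha, test_script_url):
    """Return job settings for one pull request (illustrative names only)."""
    return {
        # Schedule taken from the implementation suggestion above.
        "SCHEDULE": "tests/boot/boot_to_desktop,tests/run_custom",
        # Hypothetical variables a custom test module could read
        # to download the test script and report back to the PR.
        "CUSTOM_TEST_URL": test_script_url,
        "GITHUB_REPO": repo_url,
        "GITHUB_PR": str(pr_number),
        "GITHUB_SHA": head_sha,
    }


def as_query_string(settings):
    """Encode the settings as URL parameters in a stable order."""
    return urlencode(sorted(settings.items()))
```

Keeping the settings in a plain dictionary means the same data could be handed to whatever trigger mechanism is used, be it a client call or a single HTTP request.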

Further ideas

With #128360 a webhook-based approach has been implemented. The following aspects of it could still be improved:

  1. Better scheduled product pages: The PR check "Details" link leads to the page of the related scheduled product. It would make sense to list all jobs belonging to that scheduled product there in a nice table so one can easily access the jobs. It would also be great if it would show the latest jobs in the restart/clone chain (and not just the initial jobs). Maybe we could also show the overall status of the scheduled product reusing the code we use to report it to the CI check. Note that this would also be an improvement in general as it helps not just for the CI use case but when dealing with scheduled products in general.
  2. Better error handling when reporting back: possibly add a retry, add an audit event in case something goes wrong
  3. Finer/better permissions: So far we require an API key/secret pair where the associated user is at least operator. That already includes many permissions. We could allow the creation of API key/secret pairs that have fewer permissions.
  4. Have an extra token as secret for signing of the webhook payload: This extra token could be generated for the used API key/secret pair so the secret for signing would not just basically be the API user/credentials again.
  5. Allow computing the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the latest Tumbleweed snapshot as the openQA-in-openQA test does).
  6. Add a mechanism to prevent users from spamming jobs.
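
The idea of a separate signing token (point 4) can be sketched as follows. GitHub signs webhook payloads with HMAC-SHA256 and sends the result as "sha256=<hex digest>" in the X-Hub-Signature-256 header. The function names and the notion of a dedicated signing_token are assumptions for illustration, not how openQA currently does it.

```python
import hashlib
import hmac


def sign_payload(signing_token: bytes, payload: bytes) -> str:
    """Compute the signature value in the format GitHub sends."""
    digest = hmac.new(signing_token, payload, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(signing_token: bytes, payload: bytes, header_value: str) -> bool:
    """Check the received header value against the expected signature.

    compare_digest avoids leaking information through timing differences.
    """
    expected = sign_payload(signing_token, payload)
    return hmac.compare_digest(expected, header_value)
```

With a token that exists only for signing, a leaked webhook secret would not double as API credentials.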

Subtasks 23 (8 open, 15 closed)

coordination #77698: [epic] synchronous qemu based system level test in pull request CI runs, e.g. standalone isotovideo or openQA tests (New, 2020-11-15)
action #77905: CI pipeline proof-of-concept running isotovideo (Resolved, okurz, 2020-11-15)
action #86063: [epic] Add possibility to trigger openQA API calls, e.g. single "jobs", without the need of the client / over the webUI / with curl (Blocked, okurz, 2021-01-13)
action #87698: openQA jobs can be triggered with single curl calls (Resolved, kraih, 2021-01-13)
action #90788: openQA jobs with arbitrary parameters can be triggered over the webUI for authenticated users with right permissions (operator+) (Workable)
action #87695: Full openQA test development, maintenance and administration from browser without the need of a local terminal size:M (Workable)
coordination #124466: [epic] Put open points from okurz's hackweek 22 project into proper tickets (Resolved, mkittler, 2023-02-14)
action #124502: [spike][timeboxed:20h] complete test definition from yaml schedule in git checked out test distribution (Resolved, mkittler, 2023-02-14)
action #125720: [spike][timeboxed:20h] Add monitoring-support into openqa-cli (Resolved, mkittler)
action #125723: Provide a ready-to-use container image or GitHub action repository to trigger/monitor openQA jobs as CI checks size:M (Resolved, mkittler, 2023-03-09)
action #126950: [openQA-in-openQA] openQA tests in pull requests to github.com/os-autoinst/os-autoinst-distri-openQA/ size:M (Resolved, jbaier_cz, 2023-03-30)
action #127949: [spike][timeboxed:20h] Research native GitHub for running openQA tests as CI checks size:M (Resolved, mkittler, 2023-04-19)
action #128360: Supporting fork based development model size:M (Resolved, mkittler)
action #129730: Adapt http://open.qa/docs/#_running_openqa_jobs_as_ci_checks for the use of github pull_request_target size:M (Resolved, mkittler)
coordination #130850: [epic] Use openqa-clone-custom-git-refspec to parse github description+comments and trigger openQA tests as part of CI (Blocked, okurz, 2023-06-15, due 2024-04-11)
action #130934: Trigger openQA tests mentioned in github description as part of CI size:M (Resolved, mkittler, 2023-06-15)
action #130940: Trigger openQA tests mentioned in github comments as part of automatic testing as well (New, okurz, 2023-06-15)
action #130943: Test parameterization for github description/comments mentioned openQA job clones as part of CI size:S (Feedback, mkittler, 2023-06-15, due 2024-04-11)
action #138203: [openQA-in-openQA] CI jobs show error but don't fail the CI job as they should *and* openqa_install+publish missing size:M (Resolved, jbaier_cz, 2023-10-18)
action #150992: [timeboxed][spike solution:20h] openQA tests in pull requests to github.com/os-autoinst/os-autoinst-distri-opensuse/ size:M (Resolved, mkittler)
action #152170: Run openQA tests in pull requests to github.com/os-autoinst/os-autoinst-distri-opensuse/ size:M (Resolved, okurz)
action #152939: Find "last build" of a product over API size:M (Resolved, tinita, 2023-12-27)
action #153460: schedule boot_to_desktop and the test module(s) changed if the change is on tests/ in os-autoinst-distri-opensuse (New, 2024-01-12)

Related issues 2 (0 open, 2 closed)

Related to openQA Tests - action #124173: [qe-core] Create status badges for verification runs (Resolved, dheidler, 2023-02-09)
Copied from openQA Project - action #44327: Trigger tests based on git refspec/branch (Resolved, okurz, 2018-11-25)

Actions #1

Updated by okurz about 5 years ago

  • Copied from action #44327: Trigger tests based on git refspec/branch added
Actions #2

Updated by okurz over 4 years ago

  • Parent task set to #58184
Actions #3

Updated by mkittler over 4 years ago

  • This use-case is also impaired by the inability to use custom needles (https://progress.opensuse.org/issues/56789 #56789).
  • Besides, the implementation suggestion "use SCHEDULE parameter with e.g. SCHEDULE=tests/boot/boot_to_desktop,tests/run_custom" implies that composability of test distributions would be required. This is of course possible by somehow embedding the base test distribution (e.g. os-autoinst-distri-opensuse) into the product repository (e.g. cobbler) and adding own tests on top of it. But this sounds rather hacky and inconvenient to use. So having a light test distribution which can simply be referred to as "base test distribution" seems desirable for this use-case.

These points are likely the tricky part. Triggering the test execution itself is likely not that hard. Instead of writing a "polling bot" I would add an API route in openQA which can be added as GitHub hook.
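
To illustrate what such a hook route would have to do first, here is a rough sketch that extracts the relevant fields from a GitHub pull_request webhook payload. The field names (action, number, pull_request.head.sha, pull_request.head.repo.clone_url, pull_request.statuses_url) follow GitHub's payload format as far as I know it; the function name and the returned structure are made up for this example.

```python
import json

# Pull request actions for which a test run makes sense (an assumption).
RELEVANT_ACTIONS = ("opened", "synchronize", "reopened")


def parse_pull_request_event(body: bytes):
    """Return the data needed to schedule a test, or None to ignore the event."""
    event = json.loads(body)
    if event.get("action") not in RELEVANT_ACTIONS:
        return None
    pr = event["pull_request"]
    head = pr["head"]
    return {
        "number": event["number"],
        "sha": head["sha"],
        "clone_url": head["repo"]["clone_url"],
        # URL for reporting the result back as a commit status.
        "statuses_url": pr["statuses_url"],
    }
```

The route itself would then schedule a job from this data and remember statuses_url so the result can be fed back to the PR.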

Actions #4

Updated by okurz almost 4 years ago

mkittler wrote:

Maybe we can still consider any need for needles an independent requirement of the part about triggering test code which I find more important.

Regarding the "hacky approach" or "base test distribution": I think we have a ticket for having something like a "linux" middleware between os-autoinst and os-autoinst-distri-opensuse that can be extracted, but for now, to have a proof-of-concept to improve upon, I recommend the "hacky" approach.

[…] I would add an API route in openQA which can be added as GitHub hook.

What would you add as an API route here?

Actions #5

Updated by okurz over 3 years ago

  • Target version set to Ready
Actions #7

Updated by okurz over 3 years ago

  • Related to coordination #77698: [epic] synchronous qemu based system level test in pull request CI runs, e.g. standalone isotovideo or openQA tests added
Actions #8

Updated by okurz over 3 years ago

  • Subject changed from Trigger openQA tests in pull requests of any product github pull request to [epic] Trigger openQA tests in pull requests of any product github pull request
  • Status changed from New to Blocked
  • Assignee set to okurz

waiting for #77698 first

Actions #9

Updated by okurz almost 3 years ago

  • Target version changed from Ready to future
Actions #10

Updated by szarate about 1 year ago

  • Related to action #124173: [qe-core] Create status badges for verification runs added
Actions #11

Updated by okurz 11 months ago

  • Target version changed from future to Ready
Actions #12

Updated by mkittler 10 months ago

  • Description updated (diff)
Actions #13

Updated by mkittler 10 months ago

  • Description updated (diff)

I've extended the "Further ideas" section because we've noticed that the webhook-based approach is not yet applicable for repos like the openQA-in-openQA test.

Actions #14

Updated by mkittler 10 months ago

  • Description updated (diff)
Actions #15

Updated by okurz 10 months ago

okurz wrote:

[…]

  1. Allow to compute the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the most latest Tumbleweed snapshot as the openQA-in-openQA test does).

For now I don't think this is necessary as one can use a static link to an appliance, e.g. http://download.opensuse.org/tumbleweed/appliances/openSUSE-Tumbleweed-Minimal-VM.x86_64-kvm-and-xen.qcow2

Actions #16

Updated by livdywan 10 months ago

okurz wrote:

okurz wrote:

[…]

  1. Allow to compute the HDD image path dynamically, e.g. by providing a regex which is matched against existing assets on the openQA instance. This would allow utilizing an existing image in a dynamic way (e.g. to use the image created by the installation job for the most latest Tumbleweed snapshot as the openQA-in-openQA test does).

For now I don't think this is necessary as one can use a static link to an appliance, e.g. http://download.opensuse.org/tumbleweed/appliances/openSUSE-Tumbleweed-Minimal-VM.x86_64-kvm-and-xen.qcow2

How? This is an external URL. You're quoting a use case for re-using the image generated by a job.

Actions #17

Updated by okurz 10 months ago

cdywan wrote:

How? This is an external URL. You're quoting a use case for re-using the image generated by a job.

There was never a strong need to re-use an openQA generated image though.
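
Should re-using an openQA-generated image become relevant after all, the regex idea quoted above could look roughly like this. The asset names and the pattern are invented for illustration, and picking the "latest" match by plain string comparison assumes the names embed a sortable snapshot date such as YYYYMMDD.

```python
import re


def pick_latest_asset(asset_names, pattern):
    """Return the greatest asset name fully matching pattern, or None."""
    regex = re.compile(pattern)
    matches = [name for name in asset_names if regex.fullmatch(name)]
    # String comparison orders names by their embedded date here.
    return max(matches, default=None)


# Hypothetical asset names; a real instance would list its own assets.
assets = [
    "opensuse-Tumbleweed-x86_64-20240301-minimal.qcow2",
    "opensuse-Tumbleweed-x86_64-20240305-minimal.qcow2",
    "opensuse-Leap-15.5-x86_64-minimal.qcow2",
]
latest = pick_latest_asset(assets, r"opensuse-Tumbleweed-x86_64-\d{8}-minimal\.qcow2")
```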

Actions #18

Updated by okurz 7 months ago

  • Target version changed from Ready to Tools - Next
Actions #19

Updated by okurz 5 months ago

  • Tracker changed from action to coordination
Actions #20

Updated by okurz 5 months ago

  • Subtask #138203 added
Actions #21

Updated by okurz 4 months ago

  • Subtask #150992 added
Actions #22

Updated by okurz 4 months ago

  • Subtask #152170 added
Actions #23

Updated by okurz 3 months ago

  • Subtask #152939 added
Actions #24

Updated by okurz 3 months ago

  • Subtask #153460 added
Actions #25

Updated by szarate about 1 month ago

@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)

Also, because I don't really want to reopen #150992:

Trying to understand the failure rate of #150992 and other consequences in the context of recent slack chat, I came to notice at least one 503 (bad gateway), which reminds me of the problems I already mentioned to you, while discussing #154798

That sleep 1 is needed, as otherwise the webUI will constantly return 503s.

And before moving any further, that has to be addressed somehow, which brings me again to the question of how we track usage statistics and monitor the web UI health. IIRC there was either a Mojolicious UI that was showing some stats (@kraih, maybe you recall?) or whatever we did to solve the similar problem we were having years ago; I think it was by incrementing something with the webUI workers/processes and so on?
https://github.com/os-autoinst/os-autoinst-distri-opensuse/actions/runs/7666441613/job/20894254516#step:5:143

Actions #26

Updated by okurz about 1 month ago

szarate wrote in #note-25:

@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)

The next steps are the subtasks in this ticket here, e.g. #130934 and #152939 currently in the backlog.
After those the other open subtasks or what comes up based on our experiences with those from production.

Also, because I don't really want to reopen #150992:
[...] I came to notice at least one 503 (bad gateway)

Yes, that's a known behavior. "Bad gateway" is a bit misleading but hard to avoid. In this context here it shouldn't ever be a problem as there is retrying and the final result is monitored for anyway.

which reminds me of the problems I already mentioned to you, while discussing #154798

Yes, I would like to solve such 503's as part of that
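
For reference, the retrying mentioned above can be sketched like this: a request is repeated with growing pauses while the webUI answers 503. The function name, the retry count and the delays are assumptions and do not describe the actual CI scripts.

```python
import time


def call_with_retry(request, retries=5, delay=1.0, sleep=time.sleep):
    """Call request() until it returns a status other than 503.

    request is any callable returning an HTTP status code. The pause
    doubles after every failed attempt; the last status is returned
    if all attempts were answered with 503.
    """
    status = None
    for attempt in range(retries):
        status = request()
        if status != 503:
            return status
        if attempt < retries - 1:
            sleep(delay * 2 ** attempt)
    return status
```

Passing sleep as a parameter keeps the function testable without actually waiting.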

Actions #27

Updated by szarate about 1 month ago

okurz wrote in #note-26:

szarate wrote in #note-25:

@okurz, I'm having a hard time trying to understand the next step for this story. What are the next three steps to move this forward? (regardless of when, but I'm looking for order of steps, represented in tickets)

The next steps are the subtasks in this ticket here, e.g. #130934 and #152939 currently in the backlog.
After those the other open subtasks or what comes up based on our experiences with those from production.

Also, because I don't really want to reopen #150992:
[...] I came to notice at least one 503 (bad gateway)

Yes, that's a known behavior. "Bad gateway" is a bit misleading but hard to avoid. In this context here it shouldn't ever be a problem as there is retrying and the final result is monitored for anyway.

which reminds me of the problems I already mentioned to you, while discussing #154798

Yes, I would like to solve such 503's as part of that

Much clearer now <3 I appreciate the clarity
