action #15844
closed
coordination #13812: [epic][dashboard] openQA Dashboard ideas
[tools]finish at least one build per day
Added by okurz over 7 years ago.
Updated about 7 years ago.
Category:
Feature requests
Description
User story
As a release manager of a product with many builds per day, I want to have at least one build fully finished and tested so that late jobs are not skipped and I get a full picture of product quality.
acceptance criteria
- AC1:
- Given a job group with only builds older than 1d (e.g. 2d)
- When two new builds are triggered consecutively in short time
- Then the first build is not obsoleted by the later one
- AC2: (regression test)
- Given a job group with a recent build (e.g. from the same day)
- When many new builds are triggered consecutively in short time
- Then jobs of older builds are obsoleted after a certain limit of tests are scheduled (previous wording: the first build is obsoleted by the later one)
tasks
I can already think of two ways how to do it:
- From the sync and trigger side, on a new build: If there are no completed builds, i.e. with no skipped jobs, within the last 24h, call sync with _NOOBSOLETEBUILD. That's it. That should do no harm if there are builds within the last 24h that are completely finished anyway.
- From an external script detect a build which is the most recent one after at least one day, mark it as important by build tagging and remove that comment later on again
okurz prefers 1.
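A minimal sketch of the decision in approach 1, assuming the sync script can get a list of recent builds with a finish time and a count of skipped jobs. The function name, the dict keys "finished_at" and "skipped_jobs" and the parameter value "1" are hypothetical and only for illustration; only _NOOBSOLETEBUILD itself comes from this ticket.

```python
from datetime import datetime, timedelta


def extra_sync_params(builds, now, window=timedelta(hours=24)):
    """Return extra parameters for the sync call on a new build.

    builds: list of dicts with the hypothetical keys
      "finished_at"  - datetime when the build finished, or None if unfinished
      "skipped_jobs" - number of jobs that were skipped/obsoleted
    """
    for build in builds:
        finished = build["finished_at"]
        if finished is None:
            continue  # still running or never finished
        recent = now - finished <= window
        complete = build["skipped_jobs"] == 0
        if recent and complete:
            # A fully tested build exists within the window: obsoleting is fine.
            return {}
    # No complete build within the window: protect the new build.
    # The value "1" is an assumption.
    return {"_NOOBSOLETEBUILD": "1"}
```

The 24h window is a parameter because, as noted under further details, the time is open for discussion.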
optional: Follow the approach mentioned in #9760#note-8 which is:
- first check current implementation if changing the priority on currently scheduled jobs has an influence on the test suite or just the job
- do not obsolete old builds but instead on new iso: if jobs for old build are in state scheduled, set priority-10
- if the priorities of all scheduled jobs within one build are equal to 0, obsolete the build
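The optional approach could be sketched roughly as follows, reading the bullets literally: lower the priority of scheduled jobs of older builds by 10 and report a build as obsolete once all its scheduled jobs have reached 0. The Job class and the function name are hypothetical, and the sign convention is an assumption: the result reported at the end of this ticket shows the priority value moving from 50 to 60, so the real script may count in the opposite direction.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Job:
    # Hypothetical minimal job record, not the openQA schema
    id: int
    build: str
    state: str
    priority: int


def deprioritize_old_jobs(jobs, new_build, step=10):
    """Return (updated_jobs, obsolete_builds) after a new build arrives."""
    updated = []
    for job in jobs:
        if job.build != new_build and job.state == "scheduled":
            # Only scheduled jobs of older builds are touched
            job = replace(job, priority=max(0, job.priority - step))
        updated.append(job)

    # Collect priorities of scheduled jobs per old build
    scheduled_prios = {}
    for job in updated:
        if job.build != new_build and job.state == "scheduled":
            scheduled_prios.setdefault(job.build, []).append(job.priority)

    # A build is obsolete once all its scheduled jobs reached priority 0
    obsolete = sorted(
        build for build, prios in scheduled_prios.items()
        if all(p == 0 for p in prios)
    )
    return updated, obsolete
```

Running jobs are left alone here, so an old build keeps whatever results it already has.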
further details
The current manual approach is that the QA reviewer of the day decides this and can mark one build as important.
The time of "24h" is open for discussion, it might be a different time, e.g. 12h, or a number of builds, or a combination of both.
That might be a feature request on openQA itself or maybe on the supporting workflows and the scripts we use for providing media to test.
Also see #9760#note-7 for notes about use of _NOOBSOLETEBUILD
> From the sync and trigger side, on a new build: If there are no builds within the last 24h, call sync with _NOOBSOLETEBUILD. That's it. That should do no harm if there are builds within the last 24h that are completely finished anyway.
I think it should be "If there are no 'completed' builds within 24 hours (i.e. no builds with skipped jobs)".
This will also then resolve the 'lots of half tested builds' problem by ensuring we do at least one full test of at least one build every 24 hours.
- Description updated (diff)
yes, right, included.
- Added: optional implementation proposal about decreasing priority of older scheduled jobs and not obsoleting
- Added: comment that "24h" is arbitrary and could be some other time or a number of builds, etc.
- Target version set to Milestone 5
where are we with this? Is one build finishing?
#13560 was about only syncing complete architectures. In the same change I also prepared for no obsoletion but rbrown and ast prefer it to stay as before for now: https://gitlab.suse.de/openqa/scripts/merge_requests/59#note_38622
It is just a simple switch but it is off now. The ticket is in state "new", it's not started. Try to convince them, implement it yourself or wait :-)
- Target version changed from Milestone 5 to Milestone 6
- Status changed from New to In Progress
- Assignee set to okurz
gsd#openqa/scripts#69 targets the approach "1." including "optional".
Deployed temporarily on osd already, closely monitoring the logs.
Updated the MR as the original approach could not work: it would call "deprioritize-or-cancel" every time "openqa-iso-sync-sles" is called and there is no build in progress or a dirty repository is present. I changed the approach to call the script from within rsync.pl just before it triggers an ISO.
So build 0271 came in. The solution works in principle, but the deprioritizing of all jobs running on a product is done per arch and medium, so there is too much deprioritization in one step and jobs are then also cancelled, even running ones.
testing the state after initial medium got triggered:
In [11]: jobs = requests.get('https://openqa.suse.de/api/v1/jobs?state=running&state=scheduled&latest=1&flavor=Server-DVD').json()
In [12]: {j['id']: j['settings']['BUILD'] for j in jobs['jobs']}
Out[12]:
{806799: '0267',
806801: '0267',
…
806954: '0267',
807143: '0271',
…
so suggestions to improve:
- match more specifically on the exact jobs of one medium/flavor/arch to deprioritize and cancel
- do not cancel running jobs at all, only scheduled
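Both suggestions could be covered by a stricter job selection, roughly like this sketch. It works on job dicts shaped like the API response above ('id' and settings['BUILD'] appear there); the 'state' key and the FLAVOR and ARCH settings are assumed to be present in the same response, and the function name is hypothetical.

```python
def jobs_to_deprioritize(jobs, new_build, flavor, arch):
    """Select ids of scheduled jobs of older builds for one exact flavor/arch.

    jobs: list of job dicts as returned under the 'jobs' key of the API response.
    """
    selected = []
    for job in jobs:
        settings = job.get("settings", {})
        if job.get("state") != "scheduled":
            continue  # never touch running jobs
        if settings.get("FLAVOR") != flavor or settings.get("ARCH") != arch:
            continue  # only the medium that is being triggered right now
        if settings.get("BUILD") == new_build:
            continue  # jobs of the new build keep their priority
        selected.append(job["id"])
    return selected
```

With such a filter each triggered medium would only affect its own old jobs, so one new build would deprioritize every old job once instead of once per arch and medium.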
- Subject changed from finish at least one build per day to [tools]finish at least one build per day
- Description updated (diff)
- Status changed from In Progress to Resolved
Now the new build 0279 was triggered. https://openqa.suse.de/tests/815422 is an example of an "old" job of build 0277 which is still scheduled but deprioritized: priority value 60 instead of the previous 50. The corresponding job from 0279 is scheduled with priority 50, so it should be executed first.
That should ensure that the user story is fulfilled, including the (updated) acceptance criteria.