action #179314
coordination #154777: [saga][epic] Shareable os-autoinst and test distribution plugins
coordination #162131: [epic] future version control related features in openQA
Improve git conflict handling in save_needle size:M
Description
Observation
We got an email report about too many failed minion jobs:
https://openqa.opensuse.org/minion/jobs?state=failed
One failed job: https://openqa.opensuse.org/minion/jobs?id=5038421 showing
created: 2025-03-20T14:38:07.148564Z
...
result:
error: |-
<strong>Failed to save ibus_test_kr-ibus-korean-switch-hangul-20250319.</strong><br><pre>Unable to push Git commit (/var/lib/openqa/share/tests/opensuse/products/opensuse/needles): To github.com:os-autoinst/os-autoinst-needles-opensuse.git
! [rejected] master -> master (fetch first)
error: failed to push some refs to 'github.com:os-autoinst/os-autoinst-needles-opensuse.git'
hint: Updates were rejected because the remote contains work that you do not
hint: have locally. This is usually caused by another repository pushing to
hint: the same ref. If you want to integrate the remote changes, use
hint: 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.</pre>
There were 5 failed jobs like this around the same time: https://openqa.opensuse.org/minion/jobs?task=save_needle&state=failed&queue=&note=
At some point the user was able to save the needle.
The problem is that update_remote and update_branch are not enabled on o3, so a push to the needles repo can happen while someone else is trying to save a new needle.
Only when those two configuration options are enabled will the code try to rebase on the upstream master.
Another problem is that update_branch needs a branch name, and that might not be the same for all repos (e.g. it is main for os-autoinst-distri-example).
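The rebase-on-rejection idea described above can be sketched roughly as follows. This is a minimal Python illustration, not the openQA implementation (which is Perl, in OpenQA::Git); the names push_with_rebase and run_git and the attempts parameter are hypothetical. It only relies on plain git push, git fetch and git rebase, and it returns the last error output so that an unrecovered failure can still be shown to the user.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes git arguments and returns (exit code, error output).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_git(args: List[str]) -> Tuple[int, str]:
    """Run a git command in the current directory (hypothetical helper)."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stderr


def push_with_rebase(remote: str, branch: str, run: Runner = run_git,
                     attempts: int = 3) -> Tuple[bool, str]:
    """Push; if the push fails, fetch, rebase onto the remote branch and retry.

    Returns (success, last error output).
    """
    last_error = ""
    for _ in range(attempts):
        code, last_error = run(["push", remote, branch])
        if code == 0:
            return True, ""
        # Possibly someone else pushed in the meantime: integrate their commits.
        code, err = run(["fetch", remote])
        if code != 0:
            last_error = err
            break
        code, err = run(["rebase", f"{remote}/{branch}"])
        if code != 0:
            # Genuine conflict: undo the partial rebase and report it.
            run(["rebase", "--abort"])
            last_error = err
            break
    return False, last_error
```

Passing the runner as a parameter keeps the retry logic testable without a real repository; a caller would surface the returned error text whenever the first element is False.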
Acceptance Criteria
- AC1: The user of the openQA webUI needle editor is less likely to see an error about a failed git push, e.g. when two users save needles at the same time
- AC2: In the general case of unrecovered git errors the error detail is still shown to users
Suggestions
- Make save_needle use the same approach as git_clone to find out the upstream default branch (instead of having to configure update_branch)
- Consider still keeping the config option to be able to override
- Also check delete_needles if it would need a similar change
- Code: https://github.com/os-autoinst/openQA/blob/master/etc/openqa/openqa.ini#L108-L111 & https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Git.pm#L68
- https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/Git.pm#L145 We do already have get_remote_default_branch() which we can use here
- https://app.codecov.io/gh/os-autoinst/openQA/blob/master/lib%2FOpenQA%2FGit.pm is already 100% covered so not that much risk there :) So is https://app.codecov.io/gh/os-autoinst/openQA/tree/master/lib%2FOpenQA%2FTask%2FGit
- https://github.com/os-autoinst/openQA/blob/master/t/16-utils-runcmd.t or maybe https://github.com/os-autoinst/openQA/blob/master/t/14-grutasks-git.t#L376
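One way to find the upstream default branch without configuring it is to ask the remote with git ls-remote --symref <remote> HEAD and read the symref line. The sketch below, in Python for illustration only, parses such output and lets a configured update_branch override the detected value. The function names and the sample output (including the made-up commit hash) are assumptions; whether get_remote_default_branch() works exactly this way should be checked in lib/OpenQA/Git.pm.

```python
import re
from typing import Optional

# Example output of `git ls-remote --symref <remote> HEAD`.
# The commit hash is made up for illustration.
SAMPLE_LS_REMOTE = (
    "ref: refs/heads/main\tHEAD\n"
    "0123456789abcdef0123456789abcdef01234567\tHEAD\n"
)


def parse_default_branch(ls_remote_output: str) -> Optional[str]:
    """Return the branch named in the symref line, or None if there is none."""
    match = re.search(r"^ref: refs/heads/(\S+)\s+HEAD$",
                      ls_remote_output, re.MULTILINE)
    return match.group(1) if match else None


def resolve_update_branch(remote: str, ls_remote_output: str,
                          configured: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit update_branch setting, else use the detected default."""
    if configured:
        return configured
    branch = parse_default_branch(ls_remote_output)
    return f"{remote}/{branch}" if branch else None
```

With the sample output, resolve_update_branch("origin", SAMPLE_LS_REMOTE) yields origin/main, while passing configured="origin/master" keeps the override, which matches the suggestion to retain the config option.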
Updated by tinita 13 days ago
The reason is that on o3 save_needle does not attempt to pull from the remote first.
[scm git]
## name of remote to get updates from before committing changes (e.g. origin, leave out-commented to disable remote update)
#update_remote = origin
## name of branch to rebase against before committing changes (e.g. origin/master, leave out-commented to disable rebase)
#update_branch = origin/master
This will only be done if these options are enabled.
We should enable the first one.
(On osd both are enabled.)
Regarding update_branch: OpenQA::Task::Needle::Save (calling $git->set_to_latest_master) is a bit outdated here. For OpenQA::Task::Git::Clone we are querying the server to find out the default branch. We should do that here too instead of using a hardcoded default branch.
The failed minion jobs are not something to worry about in this case. According to the subsequent minion jobs, the user retried and finally succeeded.
But we should improve the code.
Updated by okurz 12 days ago
- Assignee set to tinita
- Priority changed from Urgent to Normal
I checked on o3 and found
ariel:/var/lib/openqa/share/tests/opensuse/products/opensuse/needles # sudo -u geekotest git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
so the repo repaired itself. That means we can follow up with the suggestion from #179314-1, so tinita, please update the ticket or create a new one about the long-term plan for improvement.
Updated by tinita 12 days ago
- Subject changed from opensuse.org openqa.opensuse.org openqa_minion_jobs minion 'Minion Jobs - see https://openqa.opensuse.org/minion/jobs?state=failed', multiple failures in save_needle to Improve git conflict handling in save_needle
- Description updated (diff)
- Category changed from Regressions/Crashes to Feature requests
- Assignee deleted (tinita)
Updated by okurz 6 days ago
- Project changed from openQA Infrastructure (public) to openQA Project (public)
- Subject changed from Improve git conflict handling in save_needle to Improve git conflict handling in save_needle size:M
- Description updated (diff)
- Category changed from Feature requests to Feature requests
- Status changed from New to Workable