action #116911

closed

[openQA][needle] Can not commit new needle for test suite on openqa.suse.de

Added by waynechen55 over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2022-09-21
Due date:
% Done: 0%
Estimated time:

Description

Observation

Cannot commit a new needle for a test suite on openqa.suse.de. The error is:
Unable to add via git: fatal: Unable to create /var/lib/openqa/share/tests/sle/product/sle/needles/.git/index.lock: File exists.
Please refer to the following screenshot:
can_not_commit_new_needle

Steps to reproduce

  • Open the openQA needle editor for a job and select the tags and regions to be matched or not matched
  • Click save

Impact

Failures caused by needle mismatches cannot be fixed, so the affected openQA test runs will not pass.

Problem

None

Suggestion

None

Workaround

None


Files

can_not_commit_new_needle.png (87 KB), waynechen55, 2022-09-21 06:32

Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #116722: openqa.suse.de is not reachable 2022-09-18, no ping response, postgreSQL OOM and kernel panics size:M (Resolved, mkittler, 2022-09-18)

Actions #1

Updated by waynechen55 over 1 year ago

  • Project changed from openQA Project to openQA Infrastructure
Actions #2

Updated by okurz over 1 year ago

  • Priority changed from High to Urgent
  • Target version set to Ready
Actions #3

Updated by okurz over 1 year ago

I manually fixed that now.

openqa:/home/okurz # sudo -u geekotest /usr/share/openqa/script/fetchneedles 
fatal: Unable to create '/var/lib/openqa/share/tests/opensuse/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
error: could not detach HEAD
Use force=1 to discard uncommitted changes before rebasing
openqa:/home/okurz # sudo -u geekotest bash -ex /usr/share/openqa/script/fetchneedles 
+ : geekotest
+ : www
+ : openSUSE
+ : opensuse
+ : https://github.com/os-autoinst/os-autoinst-distri-opensuse.git
+ : master
+ : openqa@
+ : 'openQA web UI'
+ : opensuse
+ : 0
+ : 1
+ : https://github.com/os-autoinst/os-autoinst-needles-opensuse.git
+ : master
+ : 0
+ : 0
+ '[' '' = -h ']'
+ '[' '' = --help ']'
+ dir=/var/lib/openqa/share/tests
+ '[' -w / ']'
+ '[' 0 = 1 ']'
+ target=/var/lib/openqa/share/tests/opensuse
+ mkdir -p /var/lib/openqa/share/tests/opensuse
+ cd /var/lib/openqa/share/tests/opensuse
+ '[' '!' -d .git ']'
+ do_fetch /var/lib/openqa/share/tests/opensuse
+ target=/var/lib/openqa/share/tests/opensuse
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
fatal: Unable to create '/var/lib/openqa/share/tests/opensuse/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
error: could not detach HEAD
+ fail 'Use force=1 to discard uncommitted changes before rebasing'
+ echo 'Use force=1 to discard uncommitted changes before rebasing'
Use force=1 to discard uncommitted changes before rebasing
+ exit 1
openqa:/home/okurz # cat /var/lib/openqa/share/tests/opensuse/.git/index.lock
openqa:/home/okurz # ls -ltra /var/lib/openqa/share/tests/opensuse/.git/index.lock
-rw-r--r-- 1 geekotest nogroup 0 Sep 20 02:34 /var/lib/openqa/share/tests/opensuse/.git/index.lock
openqa:/home/okurz # rm /var/lib/openqa/share/tests/opensuse/.git/index.lock
openqa:/home/okurz # sudo -u geekotest bash -ex /usr/share/openqa/script/fetchneedles 
+ : geekotest
+ : www
+ : openSUSE
+ : opensuse
+ : https://github.com/os-autoinst/os-autoinst-distri-opensuse.git
+ : master
+ : openqa@
+ : 'openQA web UI'
+ : opensuse
+ : 0
+ : 1
+ : https://github.com/os-autoinst/os-autoinst-needles-opensuse.git
+ : master
+ : 0
+ : 0
+ '[' '' = -h ']'
+ '[' '' = --help ']'
+ dir=/var/lib/openqa/share/tests
+ '[' -w / ']'
+ '[' 0 = 1 ']'
+ target=/var/lib/openqa/share/tests/opensuse
+ mkdir -p /var/lib/openqa/share/tests/opensuse
+ cd /var/lib/openqa/share/tests/opensuse
+ '[' '!' -d .git ']'
+ do_fetch /var/lib/openqa/share/tests/opensuse
+ target=/var/lib/openqa/share/tests/opensuse
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
+ '[' 1 = 1 ']'
+ test -d products
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/caasp/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/casp/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/leap-micro/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/microos/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/opensuse/needles ']'
+ cd /var/lib/openqa/share/tests/opensuse/products/opensuse/needles
+ git_update_needles
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
fatal: Unable to create '/var/lib/openqa/share/tests/opensuse/products/opensuse/needles/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
error: could not detach HEAD
+ fail 'Use force=1 to discard uncommitted changes before rebasing'
+ echo 'Use force=1 to discard uncommitted changes before rebasing'
Use force=1 to discard uncommitted changes before rebasing
+ exit 1
openqa:/home/okurz # flock ^C
openqa:/home/okurz # lsof /var/lib/openqa/share/tests/opensuse/products/opensuse/needles/.git/index.lock
openqa:/home/okurz # echo $?
1
openqa:/home/okurz # rm /var/lib/openqa/share/tests/opensuse/products/opensuse/needles/.git/index.lock
openqa:/home/okurz # sudo -u geekotest bash -ex /usr/share/openqa/script/fetchneedles 
+ : geekotest
+ : www
+ : openSUSE
+ : opensuse
+ : https://github.com/os-autoinst/os-autoinst-distri-opensuse.git
+ : master
+ : openqa@
+ : 'openQA web UI'
+ : opensuse
+ : 0
+ : 1
+ : https://github.com/os-autoinst/os-autoinst-needles-opensuse.git
+ : master
+ : 0
+ : 0
+ '[' '' = -h ']'
+ '[' '' = --help ']'
+ dir=/var/lib/openqa/share/tests
+ '[' -w / ']'
+ '[' 0 = 1 ']'
+ target=/var/lib/openqa/share/tests/opensuse
+ mkdir -p /var/lib/openqa/share/tests/opensuse
+ cd /var/lib/openqa/share/tests/opensuse
+ '[' '!' -d .git ']'
+ do_fetch /var/lib/openqa/share/tests/opensuse
+ target=/var/lib/openqa/share/tests/opensuse
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
+ '[' 1 = 1 ']'
+ test -d products
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/caasp/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/casp/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/leap-micro/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/microos/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/opensuse/needles ']'
+ cd /var/lib/openqa/share/tests/opensuse/products/opensuse/needles
+ git_update_needles
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
++ git rev-parse --abbrev-ref --symbolic-full-name HEAD
+ '[' master = HEAD ']'
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/sle-micro/needles ']'
+ continue
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/sle/needles ']'
+ cd /var/lib/openqa/share/tests/opensuse/products/sle/needles
+ git_update_needles
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
++ git rev-parse --abbrev-ref --symbolic-full-name HEAD
+ '[' master = HEAD ']'
+ for nd in products/*/needles
+ '[' -h /var/lib/openqa/share/tests/opensuse/products/windows/needles ']'
+ cd /var/lib/openqa/share/tests/opensuse/products/windows/needles
+ git_update_needles
+ git_update master
+ branch=master
+ git gc --auto --quiet
+ git fetch -q origin
+ '[' 0 = 1 ']'
+ git rebase -q origin/master
++ git rev-parse --abbrev-ref --symbolic-full-name HEAD
+ '[' master = HEAD ']'

I think we should extend fetchneedles to call "lsof" on the lock file and remove the file if not used by any process (exit code 1 of lsof). Or can git somehow handle that better itself?
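
The proposal above could look roughly like the following. This is only a minimal sketch, not the actual change to fetchneedles: the function name remove_stale_lock is hypothetical, and it assumes that lsof exits non-zero when no process has the file open, as seen in the session above.

```shell
# Hypothetical helper (not part of fetchneedles): remove a git index lock
# only when no process currently has it open.
remove_stale_lock() {
    local lock_path="${1:-.git/index.lock}"

    # Nothing to do when there is no lock file at all.
    [ -e "$lock_path" ] || return 0

    # lsof exits with 0 when at least one process has the file open.
    if lsof "$lock_path" >/dev/null 2>&1; then
        echo "lock $lock_path is still in use, leaving it in place" >&2
        return 1
    fi

    # lsof reported no user of the file, so treat the lock as stale.
    echo "removing stale lock $lock_path"
    rm -f -- "$lock_path"
}
```

Note that there is a small window between the lsof check and the rm in which another git process could create a fresh lock, so this would only be reasonable in a context where concurrent git operations on the repository are not expected.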

Actions #4

Updated by okurz over 1 year ago

  • Related to action #116722: openqa.suse.de is not reachable 2022-09-18, no ping response, postgreSQL OOM and kernel panics size:M added
Actions #5

Updated by waynechen55 over 1 year ago

It still fails with the same error. I will try again tomorrow.

Actions #6

Updated by mkittler over 1 year ago

  • Assignee set to mkittler
Actions #7

Updated by mkittler over 1 year ago

  • Status changed from New to Feedback

Please try whether it works again. I've just removed the lock file which is likely a leftover of the unclean exit due to yet another kernel panic (see #116722).

There are now a few untracked changes left in the repo. I suppose they don't hurt anyone so I'll leave them there in case somebody wants to recover the files from openqa.suse.de:/var/lib/openqa/share/tests/sle/products/sle/needles.

I suppose it'll not be a good idea to re-trigger failed Minion jobs (https://openqa.suse.de/minion/jobs?state=failed&task=save_needle) because the changes might have already been otherwise committed and some of the failures might be duplicates (from users retrying). So I'll just delete them later.

Actions #8

Updated by okurz over 1 year ago

mkittler wrote:

Please try whether it works again. I've just removed the lock file which is likely a leftover of the unclean exit due to yet another kernel panic (see #116722).

I have already done that some hours ago, see #116911#note-3. Were there problems right now again?

Actions #9

Updated by mkittler over 1 year ago

Not sure what you did but the error fatal: Unable to create '/var/lib/openqa/share/tests/opensuse/.git/index.lock': File exists. persisted (see also the comment of @waynechen55).

Actions #10

Updated by okurz over 1 year ago

mkittler wrote:

Not sure what you did but the error fatal: Unable to create '/var/lib/openqa/share/tests/opensuse/.git/index.lock': File exists. persisted (see also the comment of @waynechen55).

Well, what I did is what is written in #116911#note-3 which shows that after handling the lock files fetchneedles completed without any errors.
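
Since several git checkouts below /var/lib/openqa/share/tests were affected (the test distribution and at least one nested needles repository), it may help to list all leftover index locks in one go before concluding that everything is clean. A minimal sketch, assuming the default path from the logs above; the function name list_git_index_locks is hypothetical:

```shell
# Hypothetical helper: print every .git/index.lock below a directory tree.
list_git_index_locks() {
    local root="${1:-/var/lib/openqa/share/tests}"
    # Match only lock files that sit directly inside a .git directory.
    find "$root" -type f -name index.lock -path '*/.git/index.lock' 2>/dev/null
}
```

An empty output would indicate that no checkout below the given directory has a leftover lock. Symlinked directories are not followed by find in this form.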

Actions #11

Updated by mkittler over 1 year ago

  • Status changed from Feedback to Resolved

So it must have broken again after that. However, it hasn't happened another time since (https://openqa.suse.de/minion/jobs?state=failed&task=save_needle is empty at the time of writing this comment), so I suppose the issue can be resolved. (And the crashes of OSD that were causing the issue in the first place should now at least be worked around correctly.)

Actions #12

Updated by okurz over 1 year ago

  • Status changed from Resolved to Feedback

I still think we should extend fetchneedles to call "lsof" on the lock file and remove the file if not used by any process (exit code 1 of lsof). Or can git somehow handle that better itself?

Actions #14

Updated by mkittler over 1 year ago

  • Status changed from Feedback to Resolved

The changes have been deployed on OSD and tests/needles are still synced. So the changes at least did not break the normal operation of the script.

I've also added a stale lock file and it seems to be removed correctly:

sudo -u geekotest bash -ex /usr/share/openqa/script/fetchneedles
…
+ lock_path=.git/index.lock
+ '[' -e .git/index.lock ']'
+ fuser --silent .git/index.lock
+ echo 'removing stale lock /var/lib/openqa/share/tests/opensuse/.git/index.lock'
removing stale lock /var/lib/openqa/share/tests/opensuse/.git/index.lock
+ rm .git/index.lock
+ git gc --auto --quiet
+ git fetch -q origin
…
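
For readers without access to the script, the traced steps suggest logic roughly like the following. This is reconstructed from the trace output only and is not the actual fetchneedles source; the function name cleanup_stale_lock is hypothetical. It assumes the fuser from psmisc, which exits non-zero when no process accesses the given file.

```shell
# Sketch reconstructed from the trace above, not the real fetchneedles code.
# Expected to run from the top of a git checkout, before any git command.
cleanup_stale_lock() {
    local lock_path=.git/index.lock

    # Remove the lock only if it exists and no process is using it.
    if [ -e "$lock_path" ] && ! fuser --silent "$lock_path"; then
        echo "removing stale lock $PWD/$lock_path"
        rm "$lock_path"
    fi
}
```

In the trace the check runs right before `git gc --auto --quiet`, so a leftover lock from an unclean shutdown no longer blocks the subsequent fetch and rebase.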