action #61221
openQA Project - coordination #58184: [saga][epic][use case] full version control awareness within openQA
openQA Project - coordination #45302: [epic] smarter fetchneedles (was: fetchneedles should ensure we are always on a branch (and try to self-repair))
osd: unable to save needles, minion fails with "fatal: Unable to create '/var/lib/openqa/.../needles/.git/index.lock'"
Start date: 2019-12-20
Due date:
% Done: 0%
Estimated time:
Description
Observation
A Grafana monitoring alert was triggered:
[osd-admins] [Alerting] Minion Jobs alert
From: Grafana <osd-admin@suse.de>
To: osd-admins@suse.de
Sender: osd-admins <osd-admins-bounces+okurz=suse.de@suse.de>
List-Id: <osd-admins.suse.de>
Date: 19/12/2019 16.05
[Alerting] Minion Jobs alert
Too many failed Minion jobs
Metric name: Failed
Value: 21.505
https://openqa.suse.de/minion/jobs?state=failed&offset=0 shows e.g. https://openqa.suse.de/minion/jobs?id=300982 with the details:
{
  "args" => [
    {
      "imagedir" => "",
      "imagedistri" => undef,
      "imagename" => "partitioning_raid-9.png",
      "imageversion" => undef,
      "job_id" => 3727183,
      "needle_json" => "{\r\n \"area\": [\r\n {\r\n \"height\": 43,\r\n \"ypos\": 725,\r\n \"xpos\": 956,\r\n \"type\": \"match\",\r\n \"width\": 68\r\n },\r\n {\r\n \"type\": \"match\",\r\n \"width\": 20,\r\n \"xpos\": 84,\r\n \"ypos\": 209,\r\n \"height\": 54\r\n }\r\n ],\r\n \"properties\": [],\r\n \"tags\": [\r\n \"ENV-15SP2ORLATER-1\",\r\n \"partitioning_raid-hard_disks-unfolded\",\r\n \"storage-ng\"\r\n ]\r\n}",
      "needledir" => "/var/lib/openqa/share/tests/sle/products/sle/needles",
      "needlename" => "partitioning_raid-hard_disks-unfolded-icon_scheme-hyperv-20191219",
      "overwrite" => undef,
      "user_id" => 194
    }
  ],
  "attempts" => 1,
  "children" => [],
  "created" => "2019-12-19T15:28:05.15569Z",
  "delayed" => "2019-12-19T15:28:05.15569Z",
  "finished" => "2019-12-19T15:28:07.02545Z",
  "id" => 300982,
  "notes" => {
    "gru_id" => 27415232,
    "ttl" => 60
  },
  "parents" => [],
  "priority" => 10,
  "queue" => "default",
  "result" => {
    "error" => "<strong>Failed to save partitioning_raid-hard_disks-unfolded-icon_scheme-hyperv-20191219.</strong><br><pre>Unable to add via Git: fatal: Unable to create '/var/lib/openqa/share/tests/sle/products/sle/needles/.git/index.lock': File exists.\n\nAnother git process seems to be running in this repository, e.g.\nan editor opened by 'git commit'. Please make sure all processes\nare terminated then try again. If it still fails, a git process\nmay have crashed in this repository earlier:\nremove the file manually to continue.</pre>"
  },
  "retried" => undef,
  "retries" => 0,
  "started" => "2019-12-19T15:28:05.17528Z",
  "state" => "failed",
  "task" => "save_needle",
  "time" => "2019-12-20T06:15:21.83364Z",
  "worker" => 294
}
In the needle directory on osd I can see:
geekotest@openqa:~/share/tests/sle/products/sle/needles> git status
On branch master
Your branch is up to date with 'origin/master'.
Untracked files:
(use "git add <file>..." to include in what will be committed)
partitioning_raid-hard_disks-unfolded-icon_scheme-hyperv-20191219.json
partitioning_raid-hard_disks-unfolded-icon_scheme-hyperv-20191219.png
nothing added to commit but untracked files present (use "git add" to track)
So the needle files were created and the branch is up to date with origin, but the new files are neither committed nor pushed.
There are many more failed minion jobs, mainly about "TTL Expired".
Problem
- H1: While fetchneedles-sles was running, the needle commit Minion job ran at the same time and failed because the git repository was already locked (index.lock existed).
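If H1 holds, one conceivable mitigation is to wait for the lock to disappear before touching the repository. The following is only a minimal sketch under that assumption, not what openQA or fetchneedles actually does; the function name wait_for_index_lock and the 30 second default timeout are hypothetical.

```shell
# Hypothetical helper: wait until no index.lock exists in the given checkout.
# Returns 0 once the lock is gone, 1 if it is still present after the timeout.
wait_for_index_lock() {
    repo_dir=$1
    timeout=${2:-30}   # seconds; the default is an arbitrary assumption
    waited=0
    while [ -e "$repo_dir/.git/index.lock" ]; do
        # Give up once we have waited for the full timeout.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Waiting like this only narrows the window: another git process can still take the lock between the check and the following git command, so a retry of the failed job or a shared lock around both tasks would still be needed for a real fix.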
Workaround
Commit manually and push.
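The manual steps could look roughly like the sketch below. It assumes it is run as the user owning the checkout (geekotest in the output above), that origin is the push target, and that no other git process holds the lock at that moment; the function name commit_needle and the commit message are hypothetical.

```shell
# Hypothetical helper: stage, commit and push one needle (JSON + PNG) by hand.
commit_needle() {
    needles_dir=$1   # path to the needles git checkout
    needle_name=$2   # needle file name without extension
    (
        cd "$needles_dir" || exit 1
        # Stage only the two files belonging to this needle, then commit and push.
        git add -- "$needle_name.json" "$needle_name.png" &&
            git commit -m "Add needle $needle_name (manual commit)" &&
            git push origin HEAD
    )
}
```

For the needle from this ticket the call would be:
commit_needle /var/lib/openqa/share/tests/sle/products/sle/needles partitioning_raid-hard_disks-unfolded-icon_scheme-hyperv-20191219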