Project

General

Profile

Actions

action #113030

closed

test distribution directory git revision can be parsed as "UNKNOWN" and openQA investigation fails to show test git log size:M

Added by okurz over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2022-06-24
Due date:
% Done:

0%

Estimated time:

Description

Observation

As found on o3 worker machines running a more recent version of git the test distribution git revision which is also used in the "investigation" information in openQA can be "UNKNOWN" if a correct git version can not be parsed.

Steps to reproduce

Minimal reproducer equivalent of what os-autoinst does:

podman run --rm -it registry.opensuse.org/opensuse/leap:15.3 /bin/sh -c 'zypper -n in sudo shadow git-core && useradd -M user2 && cd /tmp && mkdir git1 && cd git1 && git init && touch README && git config --global user.email "root@local" && git config --global user.name root && git add README && git commit -m "commit 1" && sudo -u user2 git -C /tmp/git1 rev-parse HEAD'

Leap 15.3 has git-core 2.35.3. The problem does not reproduce with Leap 15.2 with git-core 2.26.2 but we are running on Leap 15.3 already for longer. However the problem only appears when not using the worker cache.

Problem

os-autoinst calls git with the same user that isotovideo is called which is using the same username as openQA worker, by default "_openqa-worker". The main test distribution directories like /var/lib/openqa/share on the openQA webUI belong to the user "geekotest" so a differing user. Commonly we don't run into this problem because we use the worker cache solution which runs as the same user as the openQA worker, i.e. "_openqa-worker", so the owner matches. In the problematic case when one does not use the openQA worker cache, same as if one would run on just localhost I assume git would complain. That's already the same observation for the vanilla Leap 15.3 git version. Only the git version in Leap 15.2 (2.26.2) does not yet have that strict behaviour.

Suggestion

  • We could follow git's suggestion and call git config --global --add safe.directory $dir. Or actually as detailed in https://github.blog/2022-04-18-highlights-from-git-2-36/#stricter-repository-ownership-checks we could just run git config --global --add safe.directory '*'. Unfortunately this is only supported in git >= 2.36 so not the current Leap 15.3 or 15.4 git version 2.35.3 but might be enough. However to be safe we should likely call this command just before the rev-parse command anyway but check if it's not already in the config as otherwise it would be added multiple times.

Related issues 1 (0 open1 closed)

Related to openQA Project (public) - action #113507: [logwarn] fatal: ambiguous argument '(unreadable git hash)..abcdef': unknown revision or path not in the working treeResolvedtinita2022-07-12

Actions
Actions #1

Updated by okurz over 2 years ago

  • Due date set to 2022-07-08
  • Status changed from New to Feedback
Actions #2

Updated by okurz over 2 years ago

I had the PR prepared which I rolled out on openqaworker7 but it failed due to some restricted permissions. Quickly fixed with apparmor adaptions, see https://github.com/os-autoinst/openQA/pull/4729 . But now the problem is that the user _openqa-worker has as home directory /var/lib/empty which does not allow writing to the user _openqa-worker. Adding safe directory to /etc/gitconfig would be a possibility but that means that every user is affected:

podman run --rm -it registry.opensuse.org/opensuse/leap:15.3 /bin/sh -c 'zypper -n in sudo shadow git-core && useradd -M user2 && cd /tmp && mkdir git1 && cd git1 && git init && touch README && git config --global user.email "root@local" && git config --global user.name root && git add README && git commit -m "commit 1" && sudo echo -e "[safe]\n\tdirectory = \"*\"" >> /etc/gitconfig && sudo -u user2 git -C /tmp/git1 rev-parse HEAD'

By the way, a minimum reproducer is likely using useradd --home-dir /var/lib/empty -M user2 to be more comparable with how _openqa-worker is configured.

Actions #3

Updated by ph03nix over 2 years ago

Workaround

A possible workaround that worked for me was to create a .gitconfig file for the _openqa-worker user with the following contents:

[safe]
    directory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse/products/opensuse/os-autoinst-needles-opensuse
    directory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse/products/sle/os-autoinst-needles-sles
    directory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse

Copy&paste

Aka, the "JUST MAKE IT WORK ALREADY" approach.

mkdir /home/openqa-worker
chown _openqa-worker 
sudo -u _openqa-worker echo -e '[safe]\ndirectory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse/products/opensuse/os-autoinst-needles-opensuse\ndirectory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse/products/sle/os-autoinst-needles-sles\ndirectory = /var/lib/openqa/share/tests/os-autoinst-distri-opensuse' > /home/openqa-worker/.gitconfig

You will need to adapt the home directory for _openqa-worker in /etc/passwd from /var/lib/empty to /home/openqa-worker

Reboot and you should be good to go.

Alternative workaround

Clone the job and add the following settings: CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git#master NEEDLES_DIR=https://gitlab.suse.de/openqa/os-autoinst-needles-sles/#master

Ultimate workaround

Hell, yeah.

sudo git config --system --add safe.directory '*'
Actions #4

Updated by okurz over 2 years ago

Thanks a lot. I wanted to avoid creating a specific home directory for _openqa-worker but I could not find any environment setting or else for the git config so either we parse the git files manually or I assume we need to provide a home directory with such config.

Actions #5

Updated by livdywan over 2 years ago

okurz wrote:

Thanks a lot. I wanted to avoid creating a specific home directory for _openqa-worker but I could not find any environment setting or else for the git config so either we parse the git files manually or I assume we need to provide a home directory with such config.

Is there a drawback to using /etc/gitconfig to apply this for every user?

And of course we can add it to .git/config in the copy of the repo itself.

Actions #6

Updated by okurz over 2 years ago

cdywan wrote:

okurz wrote:

Thanks a lot. I wanted to avoid creating a specific home directory for _openqa-worker but I could not find any environment setting or else for the git config so either we parse the git files manually or I assume we need to provide a home directory with such config.

Is there a drawback to using /etc/gitconfig to apply this for every user?

Well, one drawback for sure would be that we would need to supply some tooling with root permissions to be able to write there

And of course we can add it to .git/config in the copy of the repo itself.

Hm, good idea. So we could handle that in the fetchneedles script I guess

Actions #7

Updated by tinita over 2 years ago

For providing a home directory a temporary solution via setting $HOME can be enough, e.g.

HOME=/tmp/mygitconfig
mkdir $HOME
git config --global --add safe.directory dir
# now call git command
cd dir
git rev-parse ...
Actions #8

Updated by tinita over 2 years ago

okurz wrote:

cdywan wrote:

And of course we can add it to .git/config in the copy of the repo itself.

Hm, good idea. So we could handle that in the fetchneedles script I guess

I tried that, and it doesn't have any effect for me.

(And that's probably a feature. if it worked, it would defeat the purpose of the whole thing I guess)

Actions #9

Updated by okurz over 2 years ago

Thanks to all of you. The temporary home directory is a nice idea and actually seems to be easy enough, see the update in https://github.com/os-autoinst/os-autoinst/pull/2104

Actions #10

Updated by tinita over 2 years ago

And just for the record:

man git-config
           This config setting is only respected when specified in a system or global config,
           not when it is specified in a repository config or via the command line option -c
           safe.directory=<path>.

That's why git -c "safe.directory=/path doesn't work (although to me it doesn't look like it would be a risk to support that).

Actions #11

Updated by okurz over 2 years ago

  • Due date changed from 2022-07-08 to 2022-07-15

PR merged. Awaiting feedback from varying production tests

Actions #12

Updated by okurz over 2 years ago

One more apparmor change needed: https://github.com/os-autoinst/openQA/pull/4738, merged.

Actions #13

Updated by okurz over 2 years ago

  • Tags set to reactive work
Actions #14

Updated by livdywan over 2 years ago

  • Related to action #113507: [logwarn] fatal: ambiguous argument '(unreadable git hash)..abcdef': unknown revision or path not in the working tree added
Actions #16

Updated by okurz over 2 years ago

https://openqa.opensuse.org/tests/2467480/logfile?filename=autoinst-log.txt shows os-autoinst running version including a8ee593, i.e. including https://github.com/os-autoinst/os-autoinst/pull/2116 but still failing with an error:

[2022-07-13T11:59:05.781129Z] [debug] +++ worker notes +++
[2022-07-13T11:59:06.526491Z] [debug] Current version is 4.6.1657641990.a8ee593 [interface v26]
fatal: unsafe repository ('/var/lib/openqa/share/tests/opensuse' is owned by someone else)
To add an exception for this directory, call:

    git config --global --add safe.directory /var/lib/openqa/share/tests/opensuse
[2022-07-13T11:59:06.557256Z] [debug] git hash in opensuse: (unreadable git hash)
Actions #17

Updated by okurz over 2 years ago

Trying to reproduce within a clean environment with podman run --rm -it registry.opensuse.org/opensuse/leap:15.4 /bin/sh

zypper ar -p 95 -f 'http://download.opensuse.org/repositories/devel:openQA/openSUSE_Leap_$releasever' devel_openQA
zypper ar -p 90 -f 'http://download.opensuse.org/repositories/devel:openQA:Leap:$releasever/openSUSE_Leap_$releasever' devel_openQA_Leap
zypper -n in git-core sudo os-autoinst
useradd geekotest
useradd -d /var/lib/empty _openqa-worker
mkdir -p /var/lib/openqa/share/tests/
chown geekotest /var/lib/openqa/share/tests
sudo -u geekotest git -C /var/lib/openqa/share/tests clone --depth 1 https://github.com/os-autoinst/os-autoinst-distri-example.git example
(cd /var/lib/openqa/share/tests && ln -s example opensuse)

# succeeds as expected, shows hash, but fails if the supplied directory itself is a symlink
sudo -u _openqa-worker perl -wE '
use Mojo::File qw(path);
my $dirname = path("/var/lib/openqa/share/tests/opensuse")->to_abs;
my $hecksafe = q{git config --global --get safe.directory | grep -q};
my $addsafe  = q{HOME=$(mktemp -d --tmpdir os-autoinst-git.XXXXX) && git config --global --add safe.directory};
my $version = qx{($checksafe "$dirname" && git -C "$dirname" rev-parse HEAD || $addsafe "$rname" && git -C "$dirname" rev-parse HEAD && echo rm -r \$HOME)};
$version ||= "(unreadable git hash)";
chomp($version);
say "HASH: $version"'

# fails as expected, shows git error about unsafe repository
sudo -u _openqa-worker perl -wE '
my $dirname = "/var/lib/openqa/share/tests/opensuse";
my $version = qx{git -C "$dirname" rev-parse HEAD};
$version ||= "(unreadable git hash)";
chomp($version);
say "HASH: $version"' 

# version from tinita using os-autoinst code
sudo -u _openqa-worker perl -I /usr/lib/os-autoinst -wE'                                                                                                                  
use OpenQA::Isotovideo::Utils qw/ checkout_git_repo_and_branch /;
use bmwqemu;
my $dirname = "/var/lib/openqa/share/tests/opensuse";
my $hash = OpenQA::Isotovideo::Utils::git_rev_parse($dirname);
say "HASH: $hash"'

# this might work reading "realpath"
sudo -u _openqa-worker perl -wE '
use Mojo::File qw(path);
my $dirname = path("/var/lib/openqa/share/tests/opensuse")->realpath;
my $checksafe = q{git config --global --get safe.directory | grep -q};
my $addsafe = q{HOME=$(mktemp -d --tmpdir os-autoinst-git.XXXXX) && git config --global --add safe.directory};
my $version = qx{($checksafe "$dirname" && git -C "$dirname" rev-parse HEAD || $addsafe "$dirname" && git -C "$dirname" rev-parse HEAD && echo rm -r \$HOME)};
$version ||= "(unreadable git hash)";
chomp($version);
say "HASH: $version"'

So doing https://github.com/os-autoinst/os-autoinst/pull/2119 to fix the problems with symlinks.
There might still be a problem with a trailing slash. Tried to reproduce the failure first, e.g. in t/99-full-stack.t, but haven't succeeded so far. PR 2119 merged. Next with IMHO nicer Mojo::File usage https://github.com/os-autoinst/os-autoinst/pull/2120

I realized there is one further problem: If the specified directory is not the root of a git repo, e.g. as we have within the os-autoinst repo itself within t/data/tests the first line return 'UNKNOWN' unless -e "$dirname/.git"; returns early. We can see that in t/99-full-stack.t. Maybe we can use the "error message" output from git directly and call it. So either the initial call to git rev-parse works or we add a safe directory and then call again. For this we should still keep commit c9631656 "Allow running tests (invoking isotovideo) without Git" related to #111605 in mind.

https://github.com/os-autoinst/os-autoinst/pull/2120 merged. Awaiting further results from production before thinking about next steps

Actions #18

Updated by okurz over 2 years ago

  • Due date changed from 2022-07-15 to 2022-07-22
Actions #19

Updated by mkittler over 2 years ago

  • Subject changed from test distribution directory git revision can be parsed as "UNKNOWN" and openQA investigation fails to show test git log to test distribution directory git revision can be parsed as "UNKNOWN" and openQA investigation fails to show test git log size:M
Actions #21

Updated by okurz over 2 years ago

https://openqa.opensuse.org/admin/workers/392 was updated by ggardet. https://openqa.opensuse.org/tests/2472996/logfile?filename=autoinst-log.txt shows a recent output:

[2022-07-18T18:32:26.074508Z] [debug] Current version is 4.6.1658158082.102b55e [interface v26]
[2022-07-18T18:32:26.100121Z] [debug] git hash in opensuse: 277518ece7b102ebfc403073a5996429f1c04852
…
[2022-07-18T18:32:26.792040Z] [debug] git hash in opensuse/products/opensuse/needles: 84c616a905e3dfa0412317bd216a5c2a573b8e99

so https://github.com/os-autoinst/os-autoinst/pull/2119 fixed it. Still open to have https://github.com/os-autoinst/os-autoinst/pull/2121 in which brings in further improvements and better tests. This was problematic for some days due to the switch to Tumbleweed, Tumbleweed switching to perl 5.36, a problem in OBS, and so on…

Actions #22

Updated by okurz over 2 years ago

https://github.com/os-autoinst/os-autoinst/pull/2121 merged with complete test coverage. Waiting one more turn for verification from production.

Actions #23

Updated by livdywan over 2 years ago

okurz wrote:

https://github.com/os-autoinst/os-autoinst/pull/2121 merged with complete test coverage. Waiting one more turn for verification from production.

[2022-07-20T07:14:49.814152Z] [warn] [pid:12503] fatal: ambiguous argument '(unreadable git hash)..(unreadable git hash)': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

I see a lot of these via OpenQA logreport for ariel. I guess the test coverage wasn't complete for what we need in production.

Let's ignore the message for now.

Actions #24

Updated by okurz over 2 years ago

cdywan wrote:

I see a lot of these via OpenQA logreport for ariel. I guess the test coverage wasn't complete for what we need in production.

It was complete. As I explained this is a problem from openqa-investigate looking up old jobs with unreadable git hash.

Actions #25

Updated by livdywan over 2 years ago

okurz wrote:

cdywan wrote:

I see a lot of these via OpenQA logreport for ariel. I guess the test coverage wasn't complete for what we need in production.

It was complete. As I explained this is a problem from openqa-investigate looking up old jobs with unreadable git hash.

I had a feeling you would question my phrasing :-D My point, which should be obvious, is that all cases are not covered or it wouldn't be failing now.

Actions #26

Updated by livdywan over 2 years ago

cdywan wrote:

Let's ignore the message for now.

FYI Not effective yet. It seems like I discovered an issue with logwarn which makes the unit tests pass regardless of wether the expressions are correct. Currently unsure what to do about that.

Actions #27

Updated by livdywan over 2 years ago

cdywan wrote:

cdywan wrote:

Let's ignore the message for now.

FYI Not effective yet. It seems like I discovered an issue with logwarn which makes the unit tests pass regardless of wether the expressions are correct. Currently unsure what to do about that.

Sorted now with help from Tina

Actions #29

Updated by okurz over 2 years ago

  • Due date changed from 2022-07-22 to 2022-07-29

merged and deployed to o3.

Created https://github.com/os-autoinst/openqa-logwarn/pull/43 to remove openqa-logwarn handling again.

Actions #30

Updated by okurz over 2 years ago

  • Due date deleted (2022-07-29)
  • Status changed from Feedback to Resolved
Actions

Also available in: Atom PDF