action #165408
closedUnreviewed issue (Group 24 openQA) test-running in openQA failing with error detecting default remote branch size:S
0%
Description
Observation¶
https://openqa.opensuse.org/tests/4408286#step/test_running/12
/usr/bin/env: git: Permission denied
From https://openqa.opensuse.org/tests/4408284/logfile?filename=test_running-openqa_services.log.txt
Aug 16 07:05:26 susetest openqa-gru[126234]: [info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote --symref https://github.com/os-autoinst/os-autoinst-distri-example.git HEAD
Aug 16 07:05:26 susetest openqa-gru[126234]: [55B blob data]
Aug 16 07:05:26 susetest openqa-gru[126234]: [error] cmd returned 32256
From https://openqa.opensuse.org/tests/4408284/logfile?filename=autoinst-log.txt
[2024-08-16T11:12:25.875198Z] [info] [pid:16517] ::: basetest::runtest: # Test died: command 'retry -s 5 -r 3 -- sh -c 'test -f /var/lib/openqa/share/tests/*/.git/config'' failed: the test distribution should be checked out by openQA automatically at /usr/lib/os-autoinst/autotest.pm line 412.
https://openqa.opensuse.org/tests/4408286/logfile?filename=test_running-openqa_minion_results.txt
[
"Job successfully executed",
"Error detecting remote default branch name for 'https://github.com/os-autoinst/os-autoinst-distri-example.git': /usr/bin/env: \x{e2}\x{80}\x{98}git\x{e2}\x{80}\x{99}: Permission denied at /usr/share/openqa/script/../lib/OpenQA/Task/Git/Clone.pm line 66.\n",
"Job successfully executed"
]
Note that this should read ‘git’
, it's just not encoded correctly in the error message.
Acceptance criteria¶
- AC1: Jobs don't fail because
git
cannot be executed.
Suggestions¶
- Note
env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote ...
- is why "env" is failing here - Try and fix the encoding i.e. ` ending up garbled in the JSON
- This might be caused by apparmor, which is enabled on o3/ in openQA-in-openQA tests
- Why is this sporadic? Is it a timing issue?
- Run the test 1000 times
- Extend the existing logging e.g. to reveal what is causing the permission issues
Files
Updated by tinita 3 months ago
- Has duplicate action #165411: test fails in test_running: /usr/bin/env: git: Permission denied added
Updated by tinita 3 months ago
Why do we actually run env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote ...
instead of
local $ENV{GIT_SSH_COMMAND} = 'ssh -oBatchMode=yes';
# then run git ls-remote
I guess this won't solve the problem as I don't know why we get the Permission denied
, but the code could look easier.
Updated by livdywan 3 months ago
I tried to investigate this but I'm not sure this is reproducible anymore?
tinita wrote in #note-7:
Why do we actually run
env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote ...
openQA / lib/OpenQA/Task/Git/Clone.pm follows the same pattern as os-autoinst / OpenQA/Isotovideo/Utils.pm, both introduced by @dheidler if I see it correctly. My guess is it just makes reproducing it easier as you can copy the entire command.
Updated by tinita 3 months ago · Edited
livdywan wrote in #note-9:
I tried to investigate this but I'm not sure this is reproducible anymore?
Well, we see that failure sporadically, and now we actually see the error message. That was added as part of #163784
Probably it makes sense to schedule this a hundred times or so, but I would add some debugging around it. This error means that git
was found in $PATH but is not executable or readable. At least I can't think of a different cause for this.
edit: or apparmor doesn't allow it, then I think the error would be the same
Updated by tinita 3 months ago
- Related to action #163784: [openqa-in-openqa] test fails in test_running on "retry -s 10 -r 6 -e -- sh -c 'test -f /var/lib/openqa/share/tests/*/.git/config", probably due to transient network errors size:S added
Updated by ybonatakis 3 months ago
https://openqa.opensuse.org/tests/overview?build=iob&distri=openqa&version=Tumbleweed 100 jobs passed (running twice)
Updated by ybonatakis 3 months ago · Edited
I checked the logs at the time of the https://openqa.opensuse.org/tests/4408286#step/test_running/12 (started 2024-08-16 11:16:20)
Aug 16 11:21:01 ariel systemd[1]: Started Continuously deploys openQA, see https://progress.opensuse.org/issues/105379.
Aug 16 11:21:01 ariel systemd[1]: Started Session c40651 of User geekotest.
Aug 16 11:21:01 ariel systemd[1]: Started Session c40652 of User geekotest.
Aug 16 11:21:01 ariel systemd[1]: Started Session c40653 of User geekotest.
Aug 16 11:21:01 ariel cron[19389]: pam_unix(crond:session): session opened for user geekotest by (uid=0)
Aug 16 11:21:01 ariel cron[19387]: pam_unix(crond:session): session opened for user geekotest by (uid=0)
Aug 16 11:21:01 ariel cron[19388]: pam_unix(crond:session): session opened for user geekotest by (uid=0)
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: devel:openQA looks bad for Leap 15.5 (x86_64):
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <resultlist state="7031733d2b1c68ca877c892b5908a0eb">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <result project="devel:openQA" repository="15.5" arch="x86_64" code="building" state="building" dirty="true">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="isotovideo-os-autoinst-distri-opensuse" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="isotovideo-qemu-kvm" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="isotovideo-qemu-kvm-jq" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="isotovideo-qemu-x86" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="isotovideo-qemu-x86-jq" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA" code="building">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <details>building on i02-ch2c:8</details>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </status>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA-devel-container" code="disabled"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA-single-instance-codespaces-container" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA-single-instance-container" code="disabled"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA:openQA-client-test" code="unresolvable">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <details>nothing provides openQA-client = 4.6.1723806881.14d72391, (got version 4.6.1723805254.25dc7a3f-lp155.7062.1)</details>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </status>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA:openQA-devel-test" code="unresolvable">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <details>nothing provides openQA-devel = 4.6.1723806881.14d72391, (got version 4.6.1723805254.25dc7a3f-lp155.7062.1)</details>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </status>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA:openQA-test" code="unresolvable">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <details>nothing provides openQA = 4.6.1723806881.14d72391, (got version 4.6.1723805254.25dc7a3f-lp155.7062.1)</details>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </status>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA:openQA-worker-test" code="unresolvable">
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <details>nothing provides openQA-worker = 4.6.1723806881.14d72391, (got version 4.6.1723805254.25dc7a3f-lp155.7062.1)</details>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </status>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA_container_client" code="disabled"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA_container_webUI" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA_container_webUI_loadbalancer" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA_container_worker" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openQA_container_worker_os_autoinst_distri_opensuse" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="opencv" code="disabled"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="openqa_bugfetcher" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst-distri-opensuse-deps" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst-scripts" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst-scripts-container" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst:os-autoinst-devel-test" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst:os-autoinst-openvswitch-test" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst:os-autoinst-test" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="os-autoinst_dev" code="excluded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="python-openqa_client" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="qem-bot-dev" code="disabled"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: <status package="xterm-console" code="succeeded"/>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </result>
Aug 16 11:21:01 ariel openqa-continuous-update[19449]: </resultlist>
Aug 16 11:21:01 ariel systemd[1]: openqa-continuous-update.service: Deactivated successfully.
Aug 16 11:21:01 ariel CRON[19389]: pam_unix(crond:session): session closed for user geekotest
Updated by livdywan 3 months ago
- Subject changed from Unreviewed issue (Group 24 openQA) test-running in openQA failing with error detecting default remote branch to Unreviewed issue (Group 24 openQA) test-running in openQA failing with error detecting default remote branch size:S
- Description updated (diff)
Updated by tinita 3 months ago
- Related to action #155413: Ensure apparmor is enforced in openQA-in-openQA tests size:M added
Updated by ybonatakis 3 months ago
logs from 2024-08-16 are lost in openqaworker26.
But there is a recent failing job https://openqa.opensuse.org/tests/4422622#step/test_running/9
I found again a similar "looks bad" error
Aug 22 10:39:34 openqaworker22 systemd[1]: openqa-continuous-update.service: Deactivated successfully.
Aug 22 10:45:29 openqaworker22 systemd[1]: Started Continuously deploys openQA, see https://progress.opensuse.org/issues/105379.
Aug 22 10:45:29 openqaworker22 openqa-continuous-update[4137]: devel:openQA looks bad for Leap 15.5 (x86_64):
Aug 22 10:45:29 openqaworker22 openqa-continuous-update[4137]: <resultlist state="b47e07375eb3a2402e03653ced95fb5d">
but I didnt associate the time with the test time. Now is 2:50 but the worker is 12:50.
The error appears twice in the last two/three hours
checked openqa-worker-cacheservice
and openqa-worker-cacheservice-minion
but I am not sure there is something wrong here.
Aug 22 11:03:52 openqaworker22 systemd[1]: openqa-worker-cacheservice-minion.service: Main process exited, code=exited, status=72/OSFILE
Aug 22 11:03:52 openqaworker22 systemd[1]: openqa-worker-cacheservice-minion.service: Failed with result 'exit-code'.
Updated by tinita 3 months ago
@ybonatakis the https://openqa.opensuse.org/tests/4422622#step/test_running/8 failure is unrelated. It's about #164886
Updated by jbaier_cz 3 months ago
This IS AppArmor issue which will be fixed in a while by https://github.com/os-autoinst/openQA/pull/5883. I need to double check I collected all the paths.
Updated by jbaier_cz 3 months ago
- Related to action #165692: [openQA-in-openQA] test does not fail if state is done but incomplete auto_review:"no candidate needle.*openqa-dashboard.*matched" size:S added
Updated by jbaier_cz 3 months ago
Now the failure should be not sporadic anymore as in https://openqa.opensuse.org/tests/4423883#step/test_running/8 :) The fix is already merged so as soon as the fixed openQA package is installable from our repositories this should go away.
Updated by jbaier_cz 3 months ago
Because I am blind and introduced a typo during rebase, we need another fix: https://github.com/os-autoinst/openQA/pull/5885
Updated by openqa_review 3 months ago
- Due date set to 2024-09-06
Setting due date based on mean cycle time of SUSE QE Tools