action #103683

[tools][sle][x86_64][aarch64][QEMUTPM] install package "swtpm" on x86_64 and aarch64 workers

Added by rfan1 about 2 months ago. Updated 5 days ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2021-12-08
Due date: 2022-01-14
% Done: 0%
Estimated time:

Description

Observation

Hello openQA experts,
When I tried to start a VM within openQA with a swtpm device attached, I found that the "swtpm" package is not installed.
Can you please help check and fix it?

https://openqa.suse.de/tests/7810288#details

Steps to reproduce

Job settings:
QEMUTPM: 1
QEMUTPM_VER: 1.2

Problem

Result: incomplete, finished about 5 hours ago (01:59 minutes)

Reason: backend died: open3: exec of swtpm socket --tpmstate dir=/tmp/mytpm1 --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock --log level=20 -d failed: No such file or directory at /usr/lib/perl5/5.26.1/IPC/Open3.pm line 283.

Suggestion

Installing the package "swtpm" should fix the problem.
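
The "No such file or directory" from open3 in the Problem section typically means that the swtpm executable could not be found when the backend tried to exec it. A preflight check on the worker could surface this before a job starts. The following is only a sketch in Python: the helper name find_missing_tools is hypothetical and not part of os-autoinst or openQA, and swtpm is the only tool name taken from this ticket.

```python
import shutil


def find_missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    # shutil.which returns None when no executable of that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "swtpm" is the executable the backend failed to exec in this ticket.
    missing = find_missing_tools(["swtpm"])
    if missing:
        print("missing executables:", ", ".join(missing))
    else:
        print("all required executables found")
```

An empty result only shows that the binary is on PATH for the current user; it says nothing about AppArmor or file permissions.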

Workaround

n/a


Related issues

Related to openQA Infrastructure - action #99192: Upgrade osd workers and openqa-monitor to openSUSE Leap 15.3 size:M (Resolved)

Blocked by openQA Infrastructure - action #104673: Access to o3 workers is not well-documented and not automated (Resolved) 2022-01-24

History

#1 Updated by okurz about 2 months ago

  • Related to action #99192: Upgrade osd workers and openqa-monitor to openSUSE Leap 15.3 size:M added

#2 Updated by cdywan about 2 months ago

  • Description updated (diff)

#3 Updated by cdywan about 2 months ago

I prepared a draft to add the package. The package doesn't seem to be built, though 🤔️

#4 Updated by okurz about 2 months ago

  • Project changed from openQA Project to openQA Infrastructure
  • Description updated (diff)
  • Status changed from New to Blocked
  • Assignee set to okurz
  • Target version set to Ready

The package swtpm is currently not available for openSUSE Leap 15.2 which we still run within the OSD infrastructure. Leap 15.3 has the package, same as Tumbleweed, hence blocking on #99192 .
rfan1, if you want to speed this up you can check whether the package "swtpm" builds on Leap 15.2; otherwise this will likely take some weeks or months.

#5 Updated by rfan1 about 2 months ago

okurz I am able to find the package at https://software.opensuse.org/download/package?package=swtpm&project=security

For openSUSE Leap 15.2 run the following as root:

zypper addrepo https://download.opensuse.org/repositories/security/openSUSE_Leap_15.2/security.repo
zypper refresh
zypper install swtpm

Not sure whether this helps.

#6 Updated by rfan1 about 1 month ago

cdywan,

It seems the workers have already been upgraded to Leap 15.3, so swtpm should be available, right?

#7 Updated by okurz about 1 month ago

  • Status changed from Blocked to Feedback

cdywan I think the package "os-autoinst-swtpm" does not build because the files section for the subpackage is missing from the spec file. https://github.com/os-autoinst/os-autoinst/pull/1888 should fix it

#8 Updated by okurz about 1 month ago

  • Status changed from Feedback to Workable
  • Assignee changed from okurz to cdywan

PR merged, https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/620 as well. Multiple failures from deployment: https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/756835#L1980

cdywan I think you opted to only build the package for x86_64 but try to install it on all workers, e.g. also ppc64le. Ideally, build os-autoinst-swtpm for all architectures. If that is not possible, install it only on x86_64 workers

#9 Updated by cdywan about 1 month ago

  • Status changed from Workable to In Progress

Ack, seems like it won't install on malbec and grenache-1, and arm workers also seem problematic but I won't copy all of what looks to be extremely redundant logs:

malbec.arch.suse.de:
----------
          ID: worker.packages
    Function: pkg.installed
      Result: False
     Comment: Attempt 1: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r0b6f1e79c7ca408ea42c92df617f3705.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 2: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r09bb7ec760604d469c2ff84ab94abaa6.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 3: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-rf449a9a09402482a933348895d1ab5ea.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 4: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r9081f72ddab74171b4ceef3f6b19e83c.scope
              Package 'os-autoinst-swtpm' not found."
              An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-rd06655afe9314b3eb1347c4e6ac739fc.scope
              Package 'os-autoinst-swtpm' not found.
     Started: 11:01:57.612737
    Duration: 133217.624 ms
     Changes:   

     Result: False
     Comment: Attempt 1: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r180d91214ae04592bd6bcd1dad27a3ca.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 2: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r5a69c1f9699a4e0281cb6b293160ea53.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 3: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r29817aef9a1c454f8fda34c67d5fadf8.scope
              Package 'os-autoinst-swtpm' not found."
              Attempt 4: Returned a result of "False", with the following comment: "An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-r007eb7f824aa473ca757e9d5ffb4dae1.scope
              Package 'os-autoinst-swtpm' not found."
              An error was encountered while installing package(s): Zypper command failure: Running scope as unit: run-rdb51ab32c70f44e088a2f3057841f603.scope
              Package 'os-autoinst-swtpm' not found.
     Started: 11:02:00.386960
    Duration: 134652.76400000002 ms
     Changes:   

okurz wrote:

cdywan I think you opted to only build the package for x86_64 but try to install it on all workers, e.g. also ppc64le. At best build os-autoinst-swtpm for all architectures. If not possible then only install on x86_64 workers

The spec doesn't exclude any architecture afair and I see aarch64 and ppc64le on devel:openQA for 15.3 🤔️

swtpm also installs fine on malbec, so it wouldn't seem like deps should have been a problem

#10 Updated by cdywan about 1 month ago

cdywan wrote:

The spec doesn't exclude any architecture afair and I see aarch64 and ppc64le on devel:openQA for 15.3 🤔️

swtpm also installs fine on malbec, so it wouldn't seem like deps should have been a problem

The files are indeed not there:

https://download.opensuse.org/repositories/devel:/openQA/openSUSE_Leap_15.3/ppc64le/

Ah but of course, okurz your PR placed the files in x86_64. So regardless of it being built on all architectures, it's not there :-D

https://github.com/os-autoinst/os-autoinst/pull/1891

#11 Updated by openqa_review about 1 month ago

  • Due date set to 2022-01-06

Setting due date based on mean cycle time of SUSE QE Tools

#12 Updated by rfan1 about 1 month ago

I can see that the swtpm package is there.

however, later test failed with some permission denied issue: [Only on O3, OSD worked fine]

https://openqa.opensuse.org/tests/2100224#details

[2021-12-23T06:21:25.301401+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
  Can't mkdir('/tmp/mytpm1'): Permission denied at /usr/lib/os-autoinst/backend/qemu.pm line 520
[2021-12-23T06:21:25.301765+01:00] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json

#13 Updated by okurz about 1 month ago

rfan1 wrote:

however, later test failed with some permission denied issue: [Only on O3, OSD worked fine]

https://openqa.opensuse.org/tests/2100224#details

[2021-12-23T06:21:25.301401+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
  Can't mkdir('/tmp/mytpm1'): Permission denied at /usr/lib/os-autoinst/backend/qemu.pm line 520
[2021-12-23T06:21:25.301765+01:00] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json

Apparmor profiles for the worker need to be extended

#14 Updated by cdywan about 1 month ago

rfan1 wrote:
however, later test failed with some permission denied issue: [Only on O3, OSD worked fine]

Confirmed. I ran the test from #101015 on malbec since that's the machine I was checking the package availability on earlier. And it has /usr/share/openqa/script/worker { ... /tmp/* rwk, ... } which is what's in openQA master.

I don't know what oss-cobbler-03 is, though, which your o3 job ran on. I can't get to it from my machine or o3 🤔️ It also seems to have no unique worker class meaning I couldn't do a smoke test on it even if I knew how to login.

So why does this machine apparently not have the apparmor rules we have in git?

#15 Updated by rfan1 19 days ago

cdywan wrote:

rfan1 wrote:
however, later test failed with some permission denied issue: [Only on O3, OSD worked fine]

Confirmed. I ran the test from #101015 on malbec since that's the machine I was checking the package availability on earlier. And it has /usr/share/openqa/script/worker { ... /tmp/* rwk, ... } which is what's in openQA master.

I don't know what oss-cobbler-03 is, though, which your o3 job ran on. I can't get to it from my machine or o3 🤔️ It also seems to have no unique worker class meaning I couldn't do a smoke test on it even if I knew how to login.

So, why doesn't this machine probably have the apparmor rules we have in git?

As discussed with internal team members, we don't know how to access the o3 workers.
okurz, do you know who maintains the o3 workers?

BR//Richard.

#16 Updated by okurz 18 days ago

It's likely maintained by ggardet_arm, "guillaume_g" in irc://libera.chat/opensuse-factory

#17 Updated by cdywan 18 days ago

  • Status changed from In Progress to Feedback

I notice that the job didn't pass before:

Could not open 'opensuse/lib/../schedule/security/tpm_tools.yaml' for reading: No such file or directory at /usr/lib/perl5/vendor_perl/5.26.1/YAML/PP/Lexer.pm line 139.

Interestingly there are no alerts even though the worker in question went offline 🤔️ I'm inclined to assume this is somebody's personal machine and not part of our infra.

Worker oss-cobbler-03:7
Host: oss-cobbler-03
Instance: 7
Seen: a day ago
Status: Offline

The same job also recently ran on ip-10-0-0-58:3 and openqa-aarch64, though, so I'm guessing there is something wrong on o3 if several workers don't have the correct apparmor profiles...

I can't log in on any of the machines, though, even via ariel.

#18 Updated by ggardet_arm 18 days ago

That's expected I guess, since schedule/security/tpm_tools.yaml is not part of https://github.com/os-autoinst/os-autoinst-distri-openSUSE

#19 Updated by cdywan 18 days ago

ggardet_arm wrote:

That's expected I guess, since schedule/security/tpm_tools.yaml is not part of https://github.com/os-autoinst/os-autoinst-distri-openSUSE

So should I simply disregard any issues here, if the test is not really maintained? As stated above I have no way of investigating why the apparmor profiles don't seem to match what's in git.

#20 Updated by ggardet_arm 18 days ago

I installed os-autoinst-swtpm on the following workers:

  • openqa-aarch64 (on openSUSE network)
  • oss-cobbler-03 (Remote worker)

It is not installed on ip-10-0-0-58 (remote worker) because there is no such package available in repos (SLE15-SP2 machine).

#21 Updated by rfan1 17 days ago

Thanks all for the kind help!

I can see the latest run failed again with worker 'openqaworker7':
https://openqa.opensuse.org/tests/2123351

#22 Updated by Xiaojing_liu 17 days ago

rfan1 wrote:

Thanks all for the kindly help!

I can see the latest run failed again with worker 'openqaworker7':
https://openqa.opensuse.org/tests/2123351

This job failed with "backend died: Can't mkdir('/tmp/mytpm1'): Permission denied at /usr/lib/os-autoinst/backend/qemu.pm line 520". On 'openqaworker7' there are two paths for 'mkdir':

  openqaworker7:~ # whereis mkdir
  mkdir: /usr/bin/mkdir /bin/mkdir /usr/share/man/man1/mkdir.1.gz
  openqaworker7:~ # ll -h /bin/mkdir
  lrwxrwxrwx 1 root root 14 Sep 24 15:51 /bin/mkdir -> /usr/bin/mkdir

We define the permission rix for /usr/bin/mkdir in 'usr.share.openqa.script.worker'. So I guess that when we call 'mkdir' in the code, '/bin/mkdir' is called instead, which then reports 'permission denied'. Maybe we could call '/usr/bin/mkdir' directly in the code.

#23 Updated by cdywan 17 days ago

Xiaojing_liu wrote:

rfan1 wrote:

Thanks all for the kindly help!

I can see the latest run failed again with worker 'openqaworker7':
https://openqa.opensuse.org/tests/2123351

This job failed in backend died: Can't mkdir('/tmp/mytpm1'): Permission denied at /usr/lib/os-autoinst/backend/qemu.pm line 520. In 'openqaworker7', there are two 'mkdir',

  openqaworker7:~ # whereis mkdir
  mkdir: /usr/bin/mkdir /bin/mkdir /usr/share/man/man1/mkdir.1.gz
  openqaworker7:~ # ll -h /bin/mkdir
  lrwxrwxrwx 1 root root 14 Sep 24 15:51 /bin/mkdir -> /usr/bin/mkdir

And we define the permission of /usr/bin/mkdir rix in 'usr.share.openqa.script.worker'. So I guess when we call 'mkdir' in the code, the '/bin/mkdir' is called, then it reports 'permission denied'. Maybe we could call '/usr/bin/mkdir' directly in the code.

Thank you for taking a look, Jane! This was a really good clue. I wasn't even thinking of the symlink situation - so this is a bit like #99195#note-18 where the destination of /bin/sh changed in 15.3. And I keep forgetting we don't enable apparmor on osd.

https://github.com/os-autoinst/openQA/pull/4433

#24 Updated by cdywan 17 days ago

cdywan wrote:

Xiaojing_liu wrote:

This job failed in backend died: Can't mkdir('/tmp/mytpm1'): Permission denied at /usr/lib/os-autoinst/backend/qemu.pm line 520. In 'openqaworker7', there are two 'mkdir',

Something Tina brought up in the daily: even if mkdir is accessible, subfolders under /tmp won't be. Which is why I think we also need the pattern fixed:

https://github.com/os-autoinst/openQA/pull/4434
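
The point about subfolders comes down to how AppArmor path globs treat "/". As far as I understand it, a single * does not cross a directory separator while ** does. The sketch below models only that one difference in Python; it is a simplification and not how apparmor_parser is implemented. The function names are hypothetical, /tmp/* is the rule quoted earlier in this ticket, and /tmp/** is used purely as an illustrative alternative, not as the content of the linked PR.

```python
import re


def apparmor_glob_to_regex(pattern):
    """Translate a simplified AppArmor path glob into a compiled regex.

    Only `*` and `**` are modelled. Other AppArmor syntax such as `?`,
    `[...]`, `{a,b}` and variables is out of scope for this sketch.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # `**` may span directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # `*` stops at the next `/`
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def rule_matches(pattern, path):
    """Return True if the whole path is covered by the glob pattern."""
    return apparmor_glob_to_regex(pattern).fullmatch(path) is not None


if __name__ == "__main__":
    sock = "/tmp/mytpm1/swtpm-sock"  # socket path from the swtpm command line
    for rule in ("/tmp/*", "/tmp/**"):
        print(rule, "covers", sock, "->", rule_matches(rule, sock))
```

Under this model a rule like /tmp/* covers a file directly in /tmp but not the swtpm socket one level further down.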

#25 Updated by cdywan 17 days ago

  • Copied to action #104673: Access to o3 workers is not well-documented and not automated added

#26 Updated by cdywan 17 days ago

  • Due date changed from 2022-01-06 to 2022-01-14
  • Status changed from Feedback to Blocked

I was previously thinking investigating one unknown worker shouldn't be an issue, but after feeling a bit lost with several machines I filed #104673, and I'm bumping the due date and marking this ticket as blocked.

#27 Updated by cdywan 17 days ago

  • Copied to deleted (action #104673: Access to o3 workers is not well-documented and not automated)

#28 Updated by cdywan 17 days ago

  • Blocked by action #104673: Access to o3 workers is not well-documented and not automated added

#29 Updated by tinita 17 days ago

Regarding /bin/mkdir vs. /usr/bin/mkdir - what apparmor needs to be allowed is the actual real path. It doesn't care about the symlinks.
See https://progress.opensuse.org/issues/99195#note-17

Also, for the bash issue https://progress.opensuse.org/issues/99195#note-8, the error message was:

Can't exec "/bin/sh": Permission denied at /usr/share/openqa/script/../lib/OpenQA/Task/Job/FinalizeResults.pm line 63.

The Can't exec comes from apparmor, while the Can't mkdir comes from the mkdir syscall. I would expect Can't exec "/usr/bin/mkdir" if it was an apparmor issue here.
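
To illustrate what "the actual real path" means here: the symlink can be resolved to its final target, which is the path a profile would have to allow according to the comment above. This is a small Python sketch that builds a throwaway layout mirroring /bin/mkdir -> /usr/bin/mkdir in a temporary directory. The function names are made up for this example, and the placeholder file is empty, not a real binary.

```python
import os
import tempfile


def real_target(path):
    """Return the canonical path with all symlinks resolved."""
    return os.path.realpath(path)


def demo():
    """Create <root>/usr/bin/mkdir plus a symlink <root>/bin/mkdir to it."""
    # Resolve the temp root first so the comparison is not confused by
    # symlinks in the temp directory path itself.
    root = os.path.realpath(tempfile.mkdtemp())
    real_dir = os.path.join(root, "usr", "bin")
    os.makedirs(real_dir)
    real_file = os.path.join(real_dir, "mkdir")
    open(real_file, "w").close()  # empty placeholder file

    link_dir = os.path.join(root, "bin")
    os.makedirs(link_dir)
    link = os.path.join(link_dir, "mkdir")
    os.symlink(real_file, link)
    return link, real_target(link), real_file


if __name__ == "__main__":
    link, resolved, real_file = demo()
    print(link, "resolves to", resolved)
```

On a worker, calling real_target("/bin/mkdir") would be expected to return /usr/bin/mkdir, given the ls output quoted earlier in this ticket.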

#30 Updated by cdywan 16 days ago

tinita wrote:

Regarding /bin/mkdir vs. /usr/bin/mkdir - what apparmor needs to be allowed is the actual real path. It doesn't care about the symlinks.
See https://progress.opensuse.org/issues/99195#note-17

Also, for the bash issue https://progress.opensuse.org/issues/99195#note-8, the error message was:

Can't exec "/bin/sh": Permission denied at /usr/share/openqa/script/../lib/OpenQA/Task/Job/FinalizeResults.pm line 63.

The Can't exec comes from apparmor, while the Can't mkdir comes from the mkdir syscall. I would expect Can't exec "/usr/bin/mkdir" if it was an apparmor issue here.

I assume we only see one or the other, even if both apply. Thank you for elaborating, though, my comments were rather too concise.

#31 Updated by tinita 16 days ago

Regarding mkdir, I think perl's mkdir doesn't use /usr/bin/mkdir at all, but a syscall. If I remove it from my PATH variable, I can still execute mkdir in a perl script.
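
The same experiment can be sketched in Python, whose os.mkdir likewise goes through the mkdir system call instead of running an external program. This is only an analog for illustration, since os-autoinst itself is Perl; the helper name is hypothetical and the directory name mytpm1 is borrowed from the error message in this ticket.

```python
import os
import tempfile


def mkdir_without_path(parent):
    """Create a directory under `parent` while PATH is empty."""
    saved = os.environ.get("PATH")
    os.environ["PATH"] = ""  # no external `mkdir` binary can be looked up now
    try:
        target = os.path.join(parent, "mytpm1")
        os.mkdir(target)  # system call, no external binary involved
        return os.path.isdir(target)
    finally:
        # Restore PATH so the rest of the process is unaffected.
        if saved is None:
            del os.environ["PATH"]
        else:
            os.environ["PATH"] = saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        print("created without PATH:", mkdir_without_path(parent))
```

If the directory is created even with an empty PATH, a profile rule for /usr/bin/mkdir cannot be what decides whether the call succeeds.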

#32 Updated by cdywan 11 days ago

  • Status changed from Blocked to Feedback

And we define the permission of /usr/bin/mkdir rix in 'usr.share.openqa.script.worker'. So I guess when we call 'mkdir' in the code, the '/bin/mkdir' is called, then it reports 'permission denied'. Maybe we could call '/usr/bin/mkdir' directly in the code.

swtpm wasn't installed, so I changed that now. /etc/apparmor.d/usr.share.openqa.script.worker is up to date.
Same for openqaworker{1,4}|power8|imagetester|rebel.

#33 Updated by rfan1 10 days ago

New issue hit now:

https://openqa.opensuse.org/tests/2133810#details

Reason: backend died: open3: exec of swtpm socket --tpmstate dir=/tmp/mytpm1 --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock --log level=20 -d failed: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 127.

#34 Updated by cdywan 9 days ago

rfan1 wrote:

New issue hit now:

https://openqa.opensuse.org/tests/2133810#details

Reason: backend died: open3: exec of swtpm socket --tpmstate dir=/tmp/mytpm1 --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock --log level=20 -d failed: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 127.

https://github.com/os-autoinst/openQA/pull/4449

#35 Updated by rfan1 6 days ago

cdywan wrote:

rfan1 wrote:

New issue hit now:

https://openqa.opensuse.org/tests/2133810#details

Reason: backend died: open3: exec of swtpm socket --tpmstate dir=/tmp/mytpm1 --ctrl type=unixio,path=/tmp/mytpm1/swtpm-sock --log level=20 -d failed: Permission denied at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 127.

https://github.com/os-autoinst/openQA/pull/4449

cdywan

Thanks for the quick fix, it can pass now.
https://openqa.opensuse.org/tests/2137553

#36 Updated by cdywan 5 days ago

  • Status changed from Feedback to Resolved

rfan1 wrote:

Thanks for the quick fix, it can pass now.
https://openqa.opensuse.org/tests/2137553

Thank you for confirming, I assume this is finally solved then.
