action #50453

closed

action #50447: [aarch64] Update aarch64 machine on o3

[aarch64] Update aarch64 machine on o3 to Leap 15.1 once it is released

Added by ggardet_arm about 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 2019-04-16
Due date:
% Done: 0%
Estimated time:

Description

A number of fixes for aarch64 are in Leap 15.1, so we should update this machine to Leap 15.1 once RC is reached.

Actions #1

Updated by ggardet_arm almost 5 years ago

zypper dup from 15.0 to 15.1 is now fixed, so we can update the openQA worker from Leap 15.0 to 15.1.

Actions #2

Updated by nicksinger almost 5 years ago

  • Subject changed from [aarch64] Update aarch64 machine on o3 to Leap 15.1 RC to [aarch64] Update aarch64 machine on o3 to Leap 15.1 once it is released
  • Status changed from New to Blocked

We, the tools team, decided not to upgrade this host to a not-yet-released version of Leap.
According to https://en.opensuse.org/openSUSE:Roadmap Leap 15.1 will be released soon, and then we can take over the upgrade.

Actions #3

Updated by ggardet_arm almost 5 years ago

  • Status changed from Blocked to Workable

openSUSE Leap 15.1 is released, so we can update the aarch64 worker on o3.

Actions #4

Updated by okurz almost 5 years ago

  • Status changed from Workable to Feedback
  • Assignee set to okurz

[03/06/2019 10:39:14] <guillaume_g> Martchus_, nsinger: Could we update aarch64 worker on o3 this week? https://progress.opensuse.org/issues/50453
[03/06/2019 11:19:11] <okurz> nsinger, Martchus_, guillaume_g: I guess I could try to upgrade the aarch64 o3 worker this week as requested by guillaume_g. In the end, we can try out the upgrade pretty safely using transactional *upgrade* and revert if we are not happy with it, any objections?

Actions #5

Updated by okurz almost 5 years ago

All are fine with it. Currently there are two jobs running on aarch64. I could do it now or later, e.g. in the evening when the schedule is empty.

Actions #6

Updated by okurz almost 5 years ago

  • Status changed from Feedback to In Progress

sed -i -e 's/15\.0/15.1/g' /etc/zypp/repos.d/*.repo && \
for i in /etc/zypp/repos.d/*15.0*.repo ; do mv $i ${i/15.0/15.1}; done && \
transactional-update --interactive dup
Checking for newer version.
New version found - updating...
Loading repository data...
Reading installed packages...
Retrieving package transactional-update-2.14.2-lp151.1.1.aarch64                                                        (1/1),  59.1 KiB (145.5 KiB unpacked)
(1/1) /tmp/transactional-update.LlnLd6Yipj/openSUSE-Leap-15.1-1/aarch64/transactional-update-2.14.2-lp151.1.1.aarch64.rpm .............................[done]

download: Done.
transactional-update 2.14.2 started
Options: --interactive dup
Separate /var detected.
/etc on overlayfs detected.
Syncing /etc of oldest snapshot /.snapshots/242/snapshot as base into new snapshot /.snapshots/254/snapshot
Calling zypper --no-cd dup
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Loading repository data...
Reading installed packages...
Computing distribution upgrade...
6 Problems:
Problem: problem with installed package perl-JSON-Validator-3.11-lp150.2.1.noarch
Problem: problem with installed package perl-Minion-Backend-SQLite-4.002-lp150.2.1.noarch
Problem: problem with installed package perl-Mojo-IOLoop-ReadWriteProcess-0.23-lp150.1.1.noarch
Problem: problem with installed package perl-Mojo-SQLite-3.001-lp150.2.1.noarch
Problem: problem with installed package perl-Sereal-Decoder-4.005-lp150.9.2.aarch64
Problem: problem with installed package perl-Sereal-Encoder-4.005-lp150.9.2.aarch64

Problem: problem with installed package perl-JSON-Validator-3.11-lp150.2.1.noarch
 Solution 1: install perl-JSON-Validator-3.08-lp151.1.1.noarch (with vendor change)
  obs://build.opensuse.org/devel:openQA  -->  openSUSE
 Solution 2: keep obsolete perl-JSON-Validator-3.11-lp150.2.1.noarch

Choose from above solutions by number or skip, retry or cancel [1/2/s/r/c] (c): 1

…

The following 5 packages are going to be downgraded:
  cpp7               
    7.4.1+r270528-lp150.9.2 -> 7.4.0+r266845-lp151.1.3  aarch64  openSUSE-Leap-15.1-1   openSUSE                                         
  kernel-firmware    
    20190312-lp150.2.16.1 -> 20190118-lp151.1.10        noarch   Main Repository (OSS)  openSUSE                                         
  libgfortran4       
    7.4.1+r270528-lp150.9.2 -> 7.4.0+r266845-lp151.1.3  aarch64  openSUSE-Leap-15.1-1   openSUSE                                         
  perl-JSON-Validator
    3.11-lp150.2.1 -> 3.08-lp151.1.1                    noarch   Main Repository (OSS)  obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Mojo-SQLite   
    3.001-lp150.2.1 -> 3.000-lp151.2.2                  noarch   openSUSE-Leap-15.1-1   obs://build.opensuse.org/devel:openQA -> openSUSE

The following 2 packages are going to change architecture:
  grub2-arm64-efi  2.02-lp150.13.20.1 -> 2.02-lp151.20.2   aarch64 -> noarch  openSUSE-Leap-15.1-1   openSUSE
  kvm_stat         4.12.14-lp150.2.2 -> 4.12.14-lp151.6.1  aarch64 -> noarch  Main Repository (OSS)  openSUSE

The following 6 packages are going to change vendor:
  perl-JSON-Validator                3.11-lp150.2.1 -> 3.08-lp151.1.1    noarch   Main Repository (OSS)  obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Minion-Backend-SQLite         4.002-lp150.2.1 -> 4.002-lp151.1.1  noarch   Main Repository (OSS)  obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Mojo-IOLoop-ReadWriteProcess  0.23-lp150.1.1 -> 0.23-lp151.1.1    noarch   Main Repository (OSS)  obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Mojo-SQLite                   3.001-lp150.2.1 -> 3.000-lp151.2.2  noarch   openSUSE-Leap-15.1-1   obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Sereal-Decoder                4.005-lp150.9.2 -> 4.005-lp151.1.1  aarch64  openSUSE-Leap-15.1-1   obs://build.opensuse.org/devel:openQA -> openSUSE
  perl-Sereal-Encoder                4.005-lp150.9.2 -> 4.005-lp151.1.1  aarch64  openSUSE-Leap-15.1-1   obs://build.opensuse.org/devel:openQA -> openSUSE

The first attempt failed because os-autoinst could not be downloaded. This was probably one of those unfortunate moments when the binaries in an OBS output folder on download.opensuse.org had been replaced while the repo index was still pointing to the old ones. I gave it another try with transactional-update --interactive dup, which went fine, and then triggered a reboot.
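
For reference, the repo switch done by the sed/mv one-liner at the top of this comment could be wrapped in a small helper with quoting, so that paths with spaces do not break the rename. This is only a sketch: the function name switch_repos, the demo directory and the example.invalid URL are made up for illustration, and sed -i assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): point every .repo file in
# a directory from one Leap release to another, then rename files whose
# name carries the old version.
switch_repos() {
    repo_dir=$1
    old=$2
    new=$3
    # Escape dots so that "15.0" is matched literally by sed.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    for f in "$repo_dir"/*.repo; do
        [ -e "$f" ] || continue
        sed -i -e "s/$old_re/$new/g" "$f"
        base=${f##*/}
        case $base in
            *"$old"*)
                new_base=$(printf '%s' "$base" | sed "s/$old_re/$new/")
                mv -- "$f" "$repo_dir/$new_base"
                ;;
        esac
    done
}

# Demo on a throwaway directory instead of /etc/zypp/repos.d.
demo=$(mktemp -d)
printf 'baseurl=http://example.invalid/leap/15.0/oss/\n' \
    > "$demo/openSUSE-Leap-15.0-1.repo"
switch_repos "$demo" 15.0 15.1
ls "$demo"
```

On the real host this would be called as switch_repos /etc/zypp/repos.d 15.0 15.1 before running transactional-update.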

Login after reboot worked, but one unit failed:

# systemctl --failed
  UNIT               LOAD   ACTIVE SUB    DESCRIPTION                                                                                                       
● irqbalance.service loaded failed failed irqbalance daemon 

Also, the worker did not stay up:

● openqa-worker@1.service - openQA Worker #1
   Loaded: loaded (/usr/lib/systemd/system/openqa-worker@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2019-06-03 12:25:25 CEST; 31s ago
  Process: 2408 ExecStart=/usr/share/openqa/script/worker --instance 1 (code=exited, status=0/SUCCESS)
 Main PID: 2408 (code=exited, status=0/SUCCESS)

Jun 03 12:25:23 openqa-aarch64 systemd[1]: Starting openQA Worker #1...
Jun 03 12:25:23 openqa-aarch64 systemd[1]: Started openQA Worker #1.
Jun 03 12:25:25 openqa-aarch64 worker[2408]: [info] CACHE: caching is enabled, setting up /var/lib/openqa/cache/openqa1-opensuse
Jun 03 12:25:25 openqa-aarch64 worker[2408]: [info] Project dir for host http://openqa1-opensuse is /var/lib/openqa/share
Jun 03 12:25:25 openqa-aarch64 worker[2408]: lscpu: cannot open /proc/cpuinfo: Permission denied
Jun 03 12:25:25 openqa-aarch64 worker[2408]: [info] registering worker openqa-aarch64 version 14 with openQA http://openqa1-opensuse using protocol version >
Jun 03 12:25:25 openqa-aarch64 worker[2408]: [error] ignoring server - server refused with code 400:

Actions #7

Updated by okurz almost 5 years ago

Temporarily disabling the AppArmor profile for the worker seems to help:

# aa-disable /usr/share/openqa/script/worker
Disabling /usr/share/openqa/script/worker.
# systemctl start openqa-worker@1 ; journalctl -f -u openqa-worker@1
-- Logs begin at Mon 2019-06-03 12:25:02 CEST. --
Jun 03 12:25:25 openqa-aarch64 worker[2408]: [error] ignoring server - server refused with code 400:
Jun 03 12:27:01 openqa-aarch64 systemd[1]: Starting openQA Worker #1...
Jun 03 12:27:01 openqa-aarch64 systemd[1]: Started openQA Worker #1.
Jun 03 12:27:02 openqa-aarch64 worker[2920]: [info] CACHE: caching is enabled, setting up /var/lib/openqa/cache/openqa1-opensuse
Jun 03 12:27:02 openqa-aarch64 worker[2920]: [info] Project dir for host http://openqa1-opensuse is /var/lib/openqa/share
Jun 03 12:27:02 openqa-aarch64 worker[2920]: lscpu: cannot open /proc/cpuinfo: Permission denied
Jun 03 12:27:02 openqa-aarch64 worker[2920]: [info] registering worker openqa-aarch64 version 14 with openQA http://openqa1-opensuse using protocol version [1]
Jun 03 12:27:03 openqa-aarch64 worker[2920]: [error] ignoring server - server refused with code 400:
Jun 03 12:28:19 openqa-aarch64 systemd[1]: Starting openQA Worker #1...
Jun 03 12:28:19 openqa-aarch64 systemd[1]: Started openQA Worker #1.
Jun 03 12:28:21 openqa-aarch64 worker[2963]: [info] CACHE: caching is enabled, setting up /var/lib/openqa/cache/openqa1-opensuse
Jun 03 12:28:21 openqa-aarch64 worker[2963]: [info] Project dir for host http://openqa1-opensuse is /var/lib/openqa/share
Jun 03 12:28:21 openqa-aarch64 worker[2963]: [info] registering worker openqa-aarch64 version 14 with openQA http://openqa1-opensuse using protocol version [1]
Jun 03 12:28:38 openqa-aarch64 worker[2963]: GLOB(0xaaaaddc9aa28)[info] got job 947659: 00947659-opensuse-Tumbleweed-DVD-aarch64-Build20190601-gnome@aarch64
Jun 03 12:28:38 openqa-aarch64 worker[2963]: [info] +++ setup notes +++
Jun 03 12:28:38 openqa-aarch64 worker[2963]: [info] start time: 2019-06-03 10:28:38
Jun 03 12:28:38 openqa-aarch64 worker[2963]: [info] running on openqa-aarch64:1 (Linux 4.12.14-lp151.28.4-default #1 SMP Fri May 24 07:57:46 UTC 2019 (af35fd1) aarch64)
Jun 03 12:28:43 openqa-aarch64 worker[2963]: [info] preparing cgroups to start isotovideo
Jun 03 12:28:43 openqa-aarch64 worker[2963]: Use of uninitialized value in subroutine entry at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/File.pm line 97.
Jun 03 12:28:43 openqa-aarch64 worker[2963]: [info] starting isotovideo container
Jun 03 12:28:43 openqa-aarch64 worker[2963]: [info] isotovideo has been started (PID: 2969)
Jun 03 12:28:43 openqa-aarch64 worker[2963]: [info] 2969: WORKING 947659

-> https://openqa.opensuse.org/tests/947659

# aa-complain /usr/share/openqa/script/worker
Setting /usr/share/openqa/script/worker to complain mode.
# systemctl restart openqa-worker.target

Actions #8

Updated by okurz almost 5 years ago

type=AVC msg=audit(1559557688.588:240): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="/usr/share/openqa/script/worker" pid=2959 comm="apparmor_parser"
type=AVC msg=audit(1559557840.869:241): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/share/openqa/script/worker" pid=3106 comm="apparmor_parser"
type=AVC msg=audit(1559557840.869:242): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/share/openqa/script/worker///usr/bin/Xvnc" pid=3106 comm="apparmor_parser"
type=AVC msg=audit(1559557840.869:243): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/share/openqa/script/worker///usr/bin/lscpu" pid=3106 comm="apparmor_parser"
type=AVC msg=audit(1559557840.869:244): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/share/openqa/script/worker///usr/bin/x3270" pid=3106 comm="apparmor_parser"
type=AVC msg=audit(1559557856.309:245): apparmor="DENIED" operation="open" profile="/usr/share/openqa/script/worker///usr/bin/lscpu" name="/proc/" pid=3158 comm="lscpu" requested_mask="r" denied_mask="r" fsuid=471 ouid=0
…
type=AVC msg=audit(1559557856.349:258): apparmor="DENIED" operation="open" profile="/usr/share/openqa/script/worker///usr/bin/lscpu" name="/proc/" pid=3171 comm="lscpu" requested_mask="r" denied_mask="r" fsuid=471 ouid=0
type=ANOM_ABEND msg=audit(1559557866.859:260): auid=4294967295 uid=471 gid=65533 ses=4294967295 pid=2975 comm="/usr/bin/isotov" exe="/usr/bin/perl" sig=11 res=1
type=AVC msg=audit(1559557875.789:261): apparmor="DENIED" operation="open" profile="/usr/share/openqa/script/worker///usr/bin/lscpu" name="/proc/" pid=3184 comm="lscpu" requested_mask="r" denied_mask="r" fsuid=471 ouid=0

I disabled the AppArmor profile again, but I do not understand why any changes would be needed now. Does anyone have a hint?
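
To see at a glance what is being denied, audit lines like the ones above can be condensed into one line per distinct denial. This is a rough sketch that assumes the field order shown in this log (profile, name, then denied_mask); the function name summarize_denials is made up.

```shell
#!/bin/sh
# Hypothetical helper: summarise AppArmor denials from audit log lines on
# stdin. Prints "<count> <profile> <name> <denied_mask>" per distinct denial.
summarize_denials() {
    grep 'apparmor="DENIED"' |
        sed -n 's/.* profile="\([^"]*\)" name="\([^"]*\)".* denied_mask="\([^"]*\)".*/\1 \2 \3/p' |
        sort | uniq -c | sed 's/^ *//'
}

# Possible usage, assuming auditd writes to its usual location:
#   summarize_denials < /var/log/audit/audit.log
```

For the excerpt above this boils down to a single entry: the lscpu subprofile being denied "r" on /proc/.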

Actions #9

Updated by ggardet_arm almost 5 years ago

It looks like lscpu is trying to read /proc/.
yast apparmor should help to check which profile update would be required.

Actions #10

Updated by ggardet_arm almost 5 years ago

Another issue since the update: https://openqa.opensuse.org/tests/947703/file/autoinst-log.txt ISO is not downloaded properly but should be available, according to https://openqa.opensuse.org/admin/assets

Actions #11

Updated by okurz almost 5 years ago

  • Status changed from In Progress to Feedback

@ggardet_arm that was my own mistake while working on #52499, you can ignore that.

The lscpu access to /proc should already be covered by https://github.com/os-autoinst/openQA/blob/master/profiles/apparmor.d/usr.share.openqa.script.worker#L159, and I do not see any kind of "old override" file in place that could explain why the profile would not be up to date.

So, any other suggestions or ideas?

Actions #12

Updated by ggardet_arm almost 5 years ago

okurz wrote:

The lscpu access to /proc should already be covered by https://github.com/os-autoinst/openQA/blob/master/profiles/apparmor.d/usr.share.openqa.script.worker#L159, and I do not see any kind of "old override" file in place that could explain why the profile would not be up to date.

So, any other suggestions or ideas?

I think it currently tries to open the /proc/ folder itself (to list the files?), whereas the profile only allows opening (r mode) files inside /proc/. But I am not an AppArmor expert. ;)
So maybe add
/proc/ r,
?
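
As far as I understand AppArmor path rules, a path with a trailing slash names the directory itself, while a glob such as /proc/** only covers entries below it, so a profile can allow every file under /proc and still deny listing /proc/. A quick way to check a profile file for the directory rule could look like this; the function name has_dir_read_rule is made up and the regex is a simplification of the real profile syntax.

```shell
#!/bin/sh
# Hypothetical check: succeed if the profile file contains a read rule for
# the directory itself (e.g. "/proc/ r,"), not just for entries below it.
# Note: $2 is used as part of a regex, so keep it to plain paths.
has_dir_read_rule() {
    profile=$1
    dir=$2
    grep -Eq "^[[:space:]]*${dir}[[:space:]]+[a-z]*r[a-z]*," "$profile"
}

# Possible usage against the installed profile (path is an assumption):
#   has_dir_read_rule /etc/apparmor.d/usr.share.openqa.script.worker /proc/ \
#       || echo "no read rule for /proc/ itself"
```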

Actions #13

Updated by okurz almost 5 years ago

Yes, you are right.

I updated the profile locally on aarch64.o.o and ran

aa-complain /usr/share/openqa/script/worker; systemctl restart openqa-worker@1 ; journalctl -f -u openqa-worker@1

and this works.

https://github.com/os-autoinst/openQA/pull/2088
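
Instead of watching journalctl -f by eye, the two symptoms seen earlier in this ticket ("Permission denied" from lscpu and [error] lines from the worker) can be checked with a small filter. This is only a sketch with a made-up function name; the patterns are simply the ones that showed up in the logs above.

```shell
#!/bin/sh
# Hypothetical check: read worker log lines on stdin and succeed only if
# none of them shows the failures seen in this ticket.
worker_log_ok() {
    ! grep -Eq 'Permission denied|\[error\]'
}

# Possible usage after restarting the worker:
#   journalctl -u openqa-worker@1 --since "5 minutes ago" --no-pager \
#       | worker_log_ok && echo "worker log looks clean"
```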

Actions #14

Updated by ggardet_arm almost 5 years ago

@okurz, it seems the worker is working properly, so could we close this ticket? Or would you prefer to keep checking for a longer period of time?

Actions #15

Updated by okurz almost 5 years ago

  • Status changed from Feedback to Resolved

yes, we are good. We can close the ticket.
